METHODS FOR IDENTIFYING PATHWAY-SPECIFIC
REPORTERS AND TARGET GENES, AND USES THEREOF 

1. INTRODUCTION

The present invention relates to methods for identifying one or more reporter 
genes for a particular biological pathway of interest. The reporter genes of this invention 
are particularly useful for analyzing the activity of particular biological pathways of 
interest, and may be further used in the design of drugs, drug therapies or other biological 
agents (e.g., insecticides, herbicides, fungicides, antibiotics, or antivirals) to target a
particular biological pathway. The present invention also relates to methods for identifying
one or more target genes for a particular biological pathway of interest. Target genes of the
invention are useful as specific targets for drugs which may be designed to enhance, inhibit,
or modulate a particular biological pathway. Methods to identify a gene which modifies the
function or structure of a member (e.g., compound or gene product) of a particular
biological pathway are provided. 

The present invention provides examples of reporter genes and/or target 
genes which have been discovered by the methods of the invention. Specifically, the 
inventors have made the surprising discovery that five S. cerevisiae genes (previously of 
unknown function) form clustered co-regulated sets of genes and are reporters of the
ergosterol-pathway. The methods of the invention are also exemplified in that the inventors
have specifically discovered six S. cerevisiae reporter genes of the protein kinase C (PKC)
pathway. Two of these genes are also novel target genes of the PKC pathway and provide
targets for the development of PKC pathway-specific drugs, drug therapies, or other related
biological or therapeutical agents. The methods of the invention are further exemplified by
the discovery of four novel reporter genes of the S. cerevisiae Invasive Growth pathway. 
One of these genes also serves as a target gene in the Invasive Growth pathway, and may be 
used to develop Invasive Growth pathway-specific drugs, drug therapies, or other related 
biological or therapeutical agents. 

2. BACKGROUND OF THE INVENTION 

Citation of a reference herein shall not be construed as an admission that 
such reference is prior art to the present invention. 

2.1. MICROARRAY TECHNOLOGY

Within the past decade, several technologies have made it possible to monitor the 
expression level of a large number of transcripts at any one time (see, e.g., Schena et al.,
1995, Quantitative monitoring of gene expression patterns with a complementary DNA
micro-array, Science 270:467-470; Lockhart et al., 1996, Expression monitoring by
hybridization to high-density oligonucleotide arrays, Nature Biotechnology 14:1675-1680;
Blanchard et al., 1996, Sequence to array: Probing the genome's secrets, Nature
Biotechnology 14:1649; U.S. Patent 5,569,588, issued October 29, 1996 to Ashby et al.
entitled "Methods for Drug Screening"). In organisms for which the complete genome is
known, it is possible to analyze the transcripts of all genes within the cell. With other
organisms, such as human, for which there is an increasing knowledge of the genome, it is 
possible to simultaneously monitor large numbers of the genes within the cell. 

Such monitoring technologies have been applied to the identification of 
genes which are up regulated or down regulated in various diseased or physiological states, 
the analyses of members of signaling cellular states, and the identification of targets for
various drugs. See, e.g., Friend and Hartwell, International Publication WO 98/38329 dated
September 3, 1998; Stoughton and Friend, U.S. Patent Application Serial No. 09/074,983,
filed on May 8, 1998; Friend and Hartwell, U.S. Provisional Application Serial No.
60/056,109, filed on August 20, 1997; Friend and Stoughton, U.S. Provisional Application
Serial Nos. 60/084,742 (filed on May 8, 1998), 60/090,004 (filed on June 19, 1998) and
60/090,046 (filed on June 19, 1998), all incorporated herein by reference for all purposes. 

Levels of various constituents of a cell are known to change in response to 
drug treatments and other perturbations of the cell's biological state. Measurements of a 
plurality of such "cellular constituents" therefore contain a wealth of information about
perturbations and their effect on the cell's biological state. Such measurements
typically comprise measurements of gene expression levels of the type discussed above, but
may also include levels of other cellular components such as, but by no means limited to,
levels of protein abundances, or protein activity levels. The collection of such
measurements is generally referred to as the "profile" of the cell's biological state.

The number of cellular constituents is typically on the order of a hundred
thousand for mammalian cells. The profile of a particular cell is therefore typically of high
complexity. Any one perturbing agent may cause a small or a large number of cellular
constituents to change their abundances or activity levels. Thus, identifying the particular
cellular constituents that are associated with a particular biological pathway is a difficult
and challenging task. Additionally, methods in the art do not provide a means by which all
of the cellular constituents which are associated with a particular pathway of interest may be
identified. Therefore, there is a need in the art for methods to identify groups of cellular
constituents which are associated with a particular biological pathway.

2.1.1. THE NEED FOR REPORTER GENES 

In order to monitor and study a particular biological pathway, it is necessary
to have a "read-out" or reporter of the pathway that allows measurement of an alteration of
the pathway. Many biological pathways, however, do not have reliable reporters associated
with them. There is a need in the art for a method to identify reporters for a particular
biological pathway of interest. Additionally, there is a need in the art for novel reporter
genes which may be assigned to a particular biological pathway. The present invention
provides such reporters and methods of identifying such reporters.

2.1.2. IDENTIFICATION OF TARGETS 

Identification of targets for drug development is a laborious process that has
had a low rate of success. Accordingly, there is a need in the art for novel targets for the
development of novel drugs and therapies against biologic pathogens of interest. There is
also a need in the art for novel targets for the development of novel drugs and therapies
which can enhance, inhibit, or modulate a particular biological pathway of interest.
Additionally, there is a need in the art for a method of screening potential drug targets that
affords high throughput and the ability to assess multiple targets simultaneously. The
present invention provides such targets and methods to identify such targets.

2.2. FUNGI AND DISEASE 

Fungi are eukaryotic microorganisms comprising a phylogenetic kingdom. 
The Kingdom Fungi is estimated to contain over 100,000 species and includes species of
"yeast", which is the common term for several families of unicellular fungi. 

Although fungal infections were once unrecognized as a significant cause of 
disease, the extensive spread of fungal infections is a major concern in hospitals, health
departments and research laboratories. According to a 1988 study, nearly 40% of all deaths
from hospital-acquired infections were caused by fungi, not bacteria or viruses (Sternberg,
S., 1994, Science 266:1632-34). 

Immunocompromised patients are particularly at risk of fungal infections. 
Patients with impaired immune systems due to AIDS, cancer chemotherapy, or those treated
with immunosuppressive drugs used to prevent rejection in organ transplant are common
hosts for fungal infections. Organisms including Cryptococcus, Candida, Histoplasma,
Coccidioides, and as many as 150 species of fungi have been linked to human or animal
diseases (Sternberg, S., 1994, Science 266:1632-34). Under immunocompromised
conditions, fungi that are normally harmless to the host when maintained in the
gastrointestinal system can be transferred to the bloodstream, eyes, brain, heart, kidneys,
and other tissues, leading to symptoms ranging in severity from white patches on the tongue
to fever, rupturing of the retina, blindness, pneumonia, heart failure, shock, or sudden
catastrophic clotting of the blood (Sternberg, S., 1994, Science 266:1632-34). In
susceptible burn victims, even baker's yeast, common in the human mouth and normally
non-virulent, can lead to severe infection (Sternberg, S., 1994, Science 266:1632-34).
Hospital transmission may also occur via catheters or other invasive equipment (Sternberg,
S., 1994, Science 266:1632-34).

Fungal infections are not limited to individuals with compromised immune
systems. Geological and meteorological events have been reported to trigger fungal
outbreaks. Following a 1994 earthquake in California, tremors were estimated to have
released infectious fungal spores from the soil, triggering a 3-year statewide epidemic that
led to more than 4500 cases per year (Sternberg, S., 1994, Science 266:1632-34).
Similarly, environmental cycles of droughts and heavy rains are believed to be associated
with the release of infectious spores leading to epidemic infections (Sternberg, S., 1994,
Science 266:1632-34).

The widespread dissipation of fungal infection coupled to the recognition of
fungi as a significant disease factor creates an increasing need for antifungal agents.
Existing antifungal therapies harbor many disadvantages as discussed in Section 2.2.1, and
novel therapies and targets for therapy are needed.

2.2.1. ANTIFUNGAL AGENTS AND
NEED FOR IMPROVEMENTS

A useful antifungal agent must be toxic to the parasite, but not to the host.
One way to achieve this goal is to target a structure or pathway that is unique to the
pathogen. For example, successful antibacterial therapies often take advantage of the
differences between the prokaryotic bacteria and the eukaryotic host. However, since
fungal pathogens, like human cells, are eukaryotic, it has been more difficult to identify
therapeutic agents that are unique to the pathogen. Among the targets explored to date are
the biochemical pathways for (1) membrane integrity; (2) ergosterol synthesis (reviewed in
Handbook of Experimental Pharmacology, 1990, Springer-Verlag, Heidelberg, JF Ryley,
eds.); (3) nucleic acid synthesis; and (4) cell wall synthesis.

However, antifungal agents and drugs currently used to treat fungal
pathogens are lacking in both efficacy and safety. To date, only a limited number of
therapeutic agents are available for the treatment of fungal infections. These drugs,
however, often prove to be toxic to the host, or are accompanied by severe side effects. The
commonly prescribed drug, Amphotericin B, a mainstay of antifungal therapy, includes
such side effects as fever, chills, low blood pressure, headache, nausea, vomiting,
inflammation of blood vessels and kidney damage (Sternberg, S., 1994, Science 266:1632-34).
Further, many of the existing therapies act to inhibit or slow fungal growth, but do not
kill the infecting fungi.

3. SUMMARY OF THE INVENTION 

The present invention relates to methods for identifying one or more reporter 
genes for a particular biological pathway of interest. The reporter genes of this invention
are particularly useful for analyzing the activity of particular biological pathways of
interest, and may be further used in the design of drugs, drug therapies or other biological
agents (e.g., insecticides, herbicides, fungicides, antibiotics, or antivirals) to target a
particular biological pathway. The present invention also relates to methods for identifying
one or more target genes for a particular biological pathway of interest. Target genes of the
invention are useful as specific targets for drugs which may be designed to enhance, inhibit,
or modulate a particular biological pathway. Methods to identify a gene which modifies the
function or structure of a member (e.g., compound or gene product) of a particular 
biological pathway are provided. 

The present invention provides examples of reporter genes and/or target
genes which have been discovered by the methods of the invention. Specifically, the
inventors have made the surprising discovery that five S. cerevisiae genes (previously of
unknown function) form clustered co-regulated sets of genes and are reporters of the
ergosterol-pathway. The methods of the invention are also exemplified in that the inventors
have specifically discovered six S. cerevisiae reporter genes of the protein kinase C (PKC)
pathway. Two of these genes are also novel target genes of the PKC pathway and provide
targets for the development of PKC pathway-specific drugs, drug therapies, or other related
biological or therapeutical agents. The methods of the invention are further exemplified by
the discovery of four novel reporter genes of the S. cerevisiae Invasive Growth pathway.
One of these genes also serves as a target gene in the Invasive Growth pathway, and may be
used to develop Invasive Growth pathway-specific drugs, drug therapies, or other related 
biological or therapeutical agents. 

The invention provides a method of identifying a reporter gene for a 
particular biological pathway in a cell comprising identifying a gene which clusters to a 
geneset associated with the biological pathway, wherein said gene which clusters to the
geneset associated with the particular biological pathway is a reporter gene.

In one embodiment the geneset associated with the particular biological
pathway is identified by a method comprising identifying one or more genes in a geneset
which are associated with the particular biological pathway, wherein said geneset having
one or more genes associated with the particular biological pathway is a geneset associated
with the particular biological pathway.

In another embodiment the geneset associated with the particular biological 
pathway is identified by identifying a geneset which is activated or inhibited by 
perturbations which target the biological pathway, wherein a geneset which is activated or 
inhibited by perturbations which target the biological pathway is a geneset associated with 
the particular biological pathway.

In one embodiment the method further comprises identifying a gene which
clusters specifically to a geneset associated with the particular biological pathway, wherein
said gene which clusters specifically to the geneset associated with the particular biological
pathway is a reporter gene.

In one embodiment the reporter gene is further identified as a gene whose
expression is not altered by perturbations which affect other biological pathways, said other
biological pathways being different from said particular biological pathway.
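
The specificity criterion of the preceding embodiment can be illustrated with a minimal sketch. It is not the inventors' implementation. The function name, the perturbation names, the use of log2 expression ratios, and both cutoff values are assumptions chosen only for illustration.

```python
def is_specific_reporter(log2_ratios, on_pathway, min_on=1.0, max_off=0.3):
    """Return True if a gene responds to every on-pathway perturbation
    and stays essentially unchanged under every other perturbation.

    log2_ratios: dict mapping perturbation name -> log2(treated / untreated).
    on_pathway:  set of perturbation names assumed to target the pathway.
    min_on, max_off: illustrative cutoffs on |log2 ratio| (assumptions).
    """
    on = [abs(v) for name, v in log2_ratios.items() if name in on_pathway]
    off = [abs(v) for name, v in log2_ratios.items() if name not in on_pathway]
    if not on:
        return False  # no on-pathway measurement, so nothing to support the call
    return all(v >= min_on for v in on) and all(v <= max_off for v in off)


# Hypothetical measurements for two candidate genes.
on_pathway = {"drug_1", "drug_2"}
candidate_a = {"drug_1": 2.1, "drug_2": 1.6, "heat_shock": 0.1, "drug_3": -0.2}
candidate_b = {"drug_1": 2.4, "drug_2": 1.9, "heat_shock": 1.8, "drug_3": 0.0}

specific_a = is_specific_reporter(candidate_a, on_pathway)  # responds only on-pathway
specific_b = is_specific_reporter(candidate_b, on_pathway)  # also responds to heat shock
```

In this sketch the second candidate is rejected because it also responds to a perturbation outside the pathway of interest, which is the situation the embodiment excludes.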

In another embodiment the geneset is provided by a method comprising: (a) 
measuring changes in expression of a plurality of genes in the cell in response to a plurality 
of perturbations to the cell; and (b) grouping or re-ordering said plurality of genes into one
or more co-varying sets, wherein said one or more co-varying sets comprise said geneset.
In a further embodiment said plurality of genes are grouped or re-ordered into one or more
co-varying sets by means of a pattern recognition algorithm. In another embodiment the
pattern recognition algorithm is a clustering algorithm. In a further embodiment the
clustering algorithm analyzes arrays or matrices, said arrays or matrices representing said
measured changes in expression of the plurality of genes in the cell in response to the
plurality of perturbations to the cell, wherein said analysis determines dissimilarities
between individual genes.
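
Steps (a) and (b) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the clustering procedure actually used by the inventors: the dissimilarity is taken to be one minus the Pearson correlation of two genes' response profiles, the grouping is a simple average-linkage agglomerative clustering, and the gene names, data values, and the 0.5 stopping threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length response profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def dissimilarity(x, y):
    """1 - r: near 0 for co-varying genes, near 2 for anti-correlated genes."""
    return 1.0 - pearson(x, y)


def cluster_genes(profiles, threshold=0.5):
    """Group genes into co-varying sets by average-linkage clustering.

    profiles maps gene name -> list of log expression ratios, one entry per
    perturbation (one row of the expression matrix). Merging stops once the
    closest pair of clusters is more dissimilar than `threshold`.
    """
    clusters = [[g] for g in sorted(profiles)]

    def linkage(a, b):
        pairs = [(g, h) for g in a for h in b]
        total = sum(dissimilarity(profiles[g], profiles[h]) for g, h in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical expression matrix: four genes measured under four perturbations.
profiles = {
    "geneA": [1.0, 2.0, 0.1, -0.2],
    "geneB": [0.9, 2.2, 0.0, -0.1],
    "geneC": [-1.0, 0.1, 1.5, 1.4],
    "geneD": [-0.8, 0.0, 1.7, 1.2],
}
genesets = cluster_genes(profiles)
```

With these hypothetical data the genes fall into two co-varying sets, one containing geneA and geneB and the other containing geneC and geneD. A gene of unknown function that lands in a set already associated with a pathway would be a candidate reporter in the sense described above.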

In one embodiment the plurality of perturbations to the cell are also grouped 
or re-ordered according to their similarity. In another embodiment said plurality of
perturbations to the cell are grouped or re-ordered by means of a pattern recognition
algorithm. In a further embodiment the pattern recognition algorithm is a clustering 
algorithm. 

In one embodiment of the invention, the clustering algorithm analyzes arrays 
or matrices, said arrays or matrices representing said measured changes in expression of the
plurality of genes in the cell in response to the plurality of perturbations to the cell. In
another embodiment the reporter gene is further identified as having a high level of induction.
In another embodiment the expression of the reporter gene is further identified to change by
at least a factor of two in response to perturbations of the particular biological pathway.

In a further embodiment expression of the reporter gene is further identified
to change by at least a factor of 10 in response to perturbations to the particular biological
pathway. In another embodiment the expression of the reporter gene is further identified to 
change by at least a factor of 100 in response to perturbations to the particular biological 
pathway. 
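
The factor-of-two, factor-of-10 and factor-of-100 criteria above can be expressed as a simple check. The sketch below is illustrative only. It assumes that a "change" may be either an increase or a decrease, so the ratio is taken in whichever direction is larger; the function names and the expression values are hypothetical.

```python
def fold_change(perturbed, unperturbed):
    """Magnitude of change as a ratio >= 1, for induction or repression."""
    if perturbed <= 0 or unperturbed <= 0:
        raise ValueError("expression levels must be positive")
    return max(perturbed / unperturbed, unperturbed / perturbed)


def highest_tier(perturbed, unperturbed, tiers=(2, 10, 100)):
    """Return the largest threshold in `tiers` that the change meets, or None."""
    fc = fold_change(perturbed, unperturbed)
    met = [t for t in tiers if fc >= t]
    return max(met) if met else None


# Hypothetical expression levels (arbitrary units).
tier_induced = highest_tier(1500.0, 100.0)   # 15-fold increase
tier_repressed = highest_tier(40.0, 100.0)   # 2.5-fold decrease
tier_flat = highest_tier(120.0, 100.0)       # 1.2-fold, below every tier
```

A candidate meeting a higher tier gives a larger dynamic range as a read-out, which is why the embodiments list progressively stricter factors.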

In one embodiment the expression of the reporter gene is further identified to 
change in response to slight perturbations to the particular biological pathway.

In another embodiment the perturbation to the particular biological pathway 
comprises exposure to a drug, and said reporter gene is further identified to change in 
response to low levels of exposure to the drug. 

In one embodiment the reporter gene is further identified to respond to 
perturbations targeted to the entire particular biological pathway. In one embodiment the
reporter gene is further identified to respond to perturbations directed to one or more
portions of the particular biological pathway. In another embodiment the reporter gene is
further identified to respond to perturbations targeted to early steps of the particular
biological pathway. In another embodiment the reporter gene is further identified to
respond to perturbations targeted to late steps of the particular biological pathway. In yet
another embodiment the reporter gene is further identified by identifying a gene which
is induced quickly in response to perturbations to the particular biological pathway.

In another embodiment the reporter gene is further identified by identifying a 
gene which reaches steady state within about eight hours after a perturbation to the 
particular biological pathway. In a further embodiment the reporter gene is further
identified by identifying a gene which reaches steady state within about six hours after a
perturbation to the particular biological pathway. In another embodiment the reporter gene
is further identified by identifying a gene which is induced within about two hours after a
perturbation to the particular biological pathway.

In still another embodiment the reporter gene is further identified by
identifying a gene which is induced within about 90 minutes after a perturbation to the
particular biological pathway. In another embodiment the reporter gene is further identified
by identifying a gene which is induced within about 60 minutes after a perturbation to the
particular biological pathway. In a further embodiment the reporter gene is further
identified by identifying a gene which is induced within about 30 minutes after a
perturbation to the particular biological pathway. In one embodiment the reporter gene is
further identified by identifying a gene which is induced within about 10 minutes after a
perturbation to the particular biological pathway. In another embodiment the reporter gene 
is further identified by identifying a gene which is induced within about 7 minutes after a 
perturbation to the particular biological pathway. 
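
The induction-time criteria above can be checked against a sampled time course, as in the following minimal sketch. It assumes, purely for illustration, that a gene counts as "induced" once its expression reaches a chosen multiple of its level at the earliest sample; the text does not define induction this way, and the function names, the factor of two, and the data are hypothetical. The steady-state criteria are not modeled here.

```python
def induction_time(time_course, factor=2.0):
    """First sampled time (minutes) at which expression reaches `factor`
    times the level of the earliest sample, or None if it never does.

    time_course: list of (minutes_after_perturbation, expression_level).
    """
    points = sorted(time_course)
    _, baseline = points[0]
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    for minutes, level in points[1:]:
        if level / baseline >= factor:
            return minutes
    return None


def induced_within(time_course, window_minutes, factor=2.0):
    """True if the gene is induced no later than `window_minutes`."""
    t = induction_time(time_course, factor)
    return t is not None and t <= window_minutes


# Hypothetical time course sampled at 0, 7, 10, 30 and 60 minutes.
course = [(0, 100.0), (7, 120.0), (10, 150.0), (30, 260.0), (60, 400.0)]
first_hit = induction_time(course)
```

Because the answer can only be as fine as the sampling schedule, a gene reported here as induced at 30 minutes may in fact have crossed the threshold at any point after the 10-minute sample.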

The invention provides a method of identifying a target gene for a particular
biological pathway in a cell comprising identifying a gene which clusters to a geneset
associated with the particular biological pathway, wherein said gene which clusters to a
geneset associated with the particular biological pathway and is identified as a gene which
is necessary for normal function of said particular biological pathway is a target gene.

In one embodiment the geneset associated with the particular biological
pathway is identified by a method comprising identifying one or more genes in a geneset
which are associated with the particular biological pathway, wherein said geneset having
one or more genes associated with the particular biological pathway is a geneset associated
with the particular biological pathway. In another embodiment the geneset associated with
the particular biological pathway is identified by identifying a geneset which is activated or
inhibited by perturbations which target the biological pathway, wherein a geneset which is 
activated or inhibited by perturbations which target the biological pathway is a geneset 
associated with the particular biological pathway. 

In one embodiment the genesets are provided by a method comprising: (a)
measuring changes in expression of a plurality of genes in the cell in response to a plurality
of perturbations to the cell; and (b) grouping or re-ordering said plurality of genes into one 
or more co-varying sets, wherein said one or more co-varying sets comprise said genesets. 

In one embodiment said plurality of genes are grouped or re-ordered into one 
or more co-varying sets by means of a pattern recognition algorithm. In another 
embodiment the pattern recognition algorithm is a clustering algorithm.

In one embodiment the clustering algorithm analyzes arrays or matrices, said
arrays or matrices representing said measured changes in expression of the plurality of 
genes in the cell in response to the plurality of perturbations to the cell, wherein said 
analysis determines dissimilarities between individual genes. 

In one embodiment the plurality of perturbations to the cell are also grouped
or re-ordered according to their similarity. In another embodiment the plurality of
perturbations to the cell are grouped or re-ordered by means of a pattern recognition 
algorithm. 

In one embodiment the pattern recognition algorithm is a clustering 
algorithm. In another embodiment the clustering algorithm analyzes arrays or matrices, said
arrays or matrices representing said measured changes in expression of the plurality of
genes in the cell in response to the plurality of perturbations to the cell.

WO 00/58520 PCT/US00/08555

In one embodiment the reporter gene is a reporter for the 
ergosterol-pathway, and the reporter gene is selected from the group consisting of: 
YHR039C (as depicted in FIG.2, as set forth in SEQ ID NO:1), YLW100W (as depicted in
FIG.4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG.6, as set forth in SEQ ID
NO:5), YGR131W (as depicted in FIG.8, as set forth in SEQ ID NO:7), and YDR453C (as
depicted in FIG.10, as set forth in SEQ ID NO:9).

In another embodiment the reporter gene is a reporter for the PKC-pathway, 
and the reporter gene is selected from the group consisting of: SLT2(YHR030C) (as
depicted in FIG.17A-B, as set forth in SEQ ID NO:11), YKR161C (as depicted in
FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in FIG.21A-B,
as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG.23A-B, as set forth
in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set forth in SEQ ID NO:19),
and ST1(YDR055W) (as depicted in FIG.27A-B, as set forth in SEQ ID NO:21).

In another embodiment the reporter gene is a reporter for the Invasive 
Growth pathway, and the reporter gene is selected from the group consisting of:
KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG.35, as set forth in SEQ ID NO:29). 

In another embodiment the biological pathway is selected from the group
consisting of: a signaling pathway, a control pathway, a mating pathway, a cell cycle
pathway, a cell division pathway, a cell repair pathway, a small molecule synthesis
pathway, a protein synthesis pathway, a DNA synthesis pathway, an RNA synthesis
pathway, a DNA repair pathway, a stress-response pathway, a cytoskeletal pathway, a
steroid pathway, a receptor-mediated signal transduction pathway, a transcriptional
pathway, a translational pathway, an immune response pathway, a heat-shock pathway, a
motility pathway, a secretion pathway, an endocytotic pathway, a protein sorting pathway, a
phagocytic pathway, a photosynthetic pathway, an excretion pathway, an electrical response
pathway, a pressure-response pathway, a protein modification pathway, a small-molecule 
response pathway, a toxic-molecule response pathway, and a transformation pathway. 

In one embodiment the target gene of the PKC-pathway is selected from the 
group consisting of: SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID 
NO:11), and YKR161C (as depicted in FIG.19A-B, as set forth in SEQ ID NO:13).

The invention provides a method for determining whether a molecule affects
the function or activity of an ergosterol pathway in a cell comprising: (a) contacting the cell
with or recombinantly expressing within a cell the molecule; and (b) determining whether
the expression of one or more of the genes selected from the group consisting of: YHR039C
(as depicted in FIG.2, as set forth in SEQ ID NO:1), YLW100W (as depicted in FIG.4, as set
forth in SEQ ID NO:3), YPL272C (as depicted in FIG.6, as set forth in SEQ ID NO:5),
YGR131W (as depicted in FIG.8, as set forth in SEQ ID NO:7), and YDR453C (as depicted
in FIG.10, as set forth in SEQ ID NO:9) is changed relative to said expression in the
absence of the molecule. In a further embodiment the method is a method for determining
whether the molecule inhibits ergosterol synthesis such that a cell contacted with the
molecule exhibits a lower level of ergosterol than a cell which is not contacted with said
molecule. In another embodiment step (b) comprises determining whether YPL272C
expression increases.

The invention provides a kit comprising in one or more containers a) a 
substance selected from the group consisting of an antibody against an ergosterol-pathway
protein, a gene probe capable of hybridizing to RNA of an ergosterol-pathway gene, and 
pairs of gene primers capable of priming amplification of at least a portion of an 
ergosterol-pathway gene, and b) a molecule known to be capable of perturbing the 
ergosterol pathway. 

The invention provides a method for identifying a molecule that activates the
ergosterol pathway in yeast comprising contacting a yeast cell with one or more candidate
molecules, and detecting a change in the RNA expression of a reporter gene for the
ergosterol-pathway relative to the expression of the reporter gene in a yeast cell not
contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: YHR039C (as depicted in FIG.2, as set forth in SEQ ID
NO:1), YLW100W (as depicted in FIG.4, as set forth in SEQ ID NO:3), YPL272C (as
depicted in FIG.6, as set forth in SEQ ID NO:5), YGR131W (as depicted in FIG.8, as set
forth in SEQ ID NO:7), and YDR453C (as depicted in FIG.10, as set forth in SEQ ID
NO:9).

The invention provides a method for identifying a molecule that activates the
ergosterol pathway in yeast comprising contacting a yeast cell with one or more candidate
molecules, and detecting a change in the protein expression of a reporter gene for the
ergosterol-pathway relative to the expression of the reporter gene in a yeast cell not
contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: YHR039C (as depicted in FIG.2, as set forth in SEQ ID
NO:1), YLW100W (as depicted in FIG.4, as set forth in SEQ ID NO:3), YPL272C (as
depicted in FIG.6, as set forth in SEQ ID NO:5), YGR131W (as depicted in FIG.8, as set
forth in SEQ ID NO:7), and YDR453C (as depicted in FIG.10, as set forth in SEQ ID
NO:9). In one embodiment the fungal cell is a transgenic cell.

The invention provides a method for identifying a molecule that modulates
the expression of an ergosterol-pathway gene selected from the group consisting of
YHR039C (as depicted in FIG.2, as set forth in SEQ ID NO:1), YLW100W (as depicted in
FIG.4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG.6, as set forth in SEQ ID
NO:5), YGR131W (as depicted in FIG.8, as set forth in SEQ ID NO:7), and YDR453C (as
depicted in FIG.10, as set forth in SEQ ID NO:9), comprising recombinantly expressing in a
fungal cell one or more candidate molecules, and detecting the expression of said
ergosterol-pathway gene; wherein an increase or decrease in the gene expression relative to
the expression in the absence of candidate molecules indicates that the molecule modulates
ergosterol-pathway gene expression. In one embodiment the fungal cell is a transgenic cell.

The invention provides a method for identifying a molecule that modulates
the activity of an ergosterol-pathway protein selected from the group consisting of
YHR039C (as depicted in FIG.3, as set forth in SEQ ID NO:2), YLW100W (as depicted in
FIG.5, as set forth in SEQ ID NO:4), YPL272C (as depicted in FIG.7, as set forth in SEQ
ID NO:6), YGR131W (as depicted in FIG.9, as set forth in SEQ ID NO:8), and YDR453C
(as depicted in FIG.11, as set forth in SEQ ID NO:10), comprising contacting a fungal cell
with one or more candidate molecules, and detecting said protein; wherein an increase or
decrease in the protein level relative to the level in the absence of candidate molecules
indicates that the molecule modulates ergosterol-pathway gene expression.

The invention provides a method of identifying a molecule that binds to a 
ligand selected from the group consisting of (i) an S. cerevisiae ergosterol-pathway protein 
selected from the group consisting of YHR039C (as depicted in FIG.3, as set forth in SEQ
ID NO:2), YLW100W (as depicted in FIG.5, as set forth in SEQ ID NO:4), YPL272C (as
depicted in FIG.7, as set forth in SEQ ID NO:6), YGR131W (as depicted in FIG.9, as set
forth in SEQ ID NO:8), and YDR453C (as depicted in FIG.11, as set forth in SEQ ID
NO:10), (ii) a fragment of the S. cerevisiae ergosterol-pathway protein, and (iii) a nucleic
acid encoding the S. cerevisiae ergosterol-pathway protein or fragment, the method
comprising: (a) contacting the ligand with a plurality of molecules under conditions
conducive to binding between the ligand and the molecules; and (b) identifying a molecule 
within the plurality that binds to the ligand. 

The invention provides a method for determining whether a molecule affects 
the function or activity of a PKC pathway in a cell comprising: (a) contacting the cell with,
or recombinantly expressing within a cell the molecule; and (b) determining whether the
expression of one or more of the genes selected from the group consisting of:
SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID NO:11), YKR161C
(as depicted in FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted
in FIG.21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in
FIG.23A-B, as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set
forth in SEQ ID NO:19), and ST1(YDR055W) (as depicted in FIG.27A-B, as set forth in 
SEQ ID NO:21) is changed relative to said expression in the absence of the molecule. In 
one embodiment step (b) comprises determining whether SLT2 expression increases. 

The invention provides a kit comprising in one or more containers a) a
substance selected from the group consisting of an antibody against a PKC-pathway protein,
a gene probe capable of hybridizing to RNA of a PKC-pathway gene, and pairs of gene
primers capable of priming amplification of at least a portion of a PKC-pathway gene, and
b) a molecule known to be capable of perturbing the PKC pathway.

The invention provides a method for identifying a molecule that activates
the PKC pathway in yeast comprising contacting a yeast cell with one or more candidate
molecules, and detecting a change in the RNA expression of a reporter gene for the
PKC-pathway relative to the expression of the reporter gene in a yeast cell not contacted by
the one or more candidate molecules, wherein the reporter gene is selected from the group
consisting of: SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID NO:11),
YKL161C (as depicted in FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W)
(as depicted in FIG.21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted
in FIG.23A-B, as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set
forth in SEQ ID NO:19), and PST1(YDR055W) (as depicted in FIG.27A-B, as set forth in
SEQ ID NO:21).

The invention provides a method for identifying a molecule that activates the
PKC pathway in yeast comprising contacting a yeast cell with one or more candidate
molecules, and detecting a change in the protein expression of a reporter gene for the
PKC-pathway relative to the expression of the reporter gene in a yeast cell not contacted by
the one or more candidate molecules, wherein the reporter gene is selected from the group
consisting of: SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID NO:11),
YKL161C (as depicted in FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W)
(as depicted in FIG.21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted
in FIG.23A-B, as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set
forth in SEQ ID NO:19), and PST1(YDR055W) (as depicted in FIG.27A-B, as set forth in
SEQ ID NO:21). In one embodiment the fungal cell is a transgenic cell.

The invention provides a method for identifying a molecule that modulates
the expression of a PKC-pathway gene selected from the group consisting of
SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID NO:11), YKL161C
(as depicted in FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted
in FIG.21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in
FIG.23A-B, as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set
forth in SEQ ID NO:19), and PST1(YDR055W) (as depicted in FIG.27A-B, as set forth in
SEQ ID NO:21), comprising recombinantly expressing in a fungal cell one or more
candidate molecules, and detecting the expression of said PKC-pathway gene; wherein an
increase or decrease in the gene expression relative to the expression in the absence of
candidate molecules indicates that the molecule modulates PKC-pathway gene expression.
In one embodiment the fungal cell is a transgenic cell.

The invention provides a method for identifying a molecule that modulates
the activity of a PKC-pathway protein selected from the group consisting of
SLT2(YHR030C) (as depicted in FIG.18, as set forth in SEQ ID NO:12), YKL161C (as
depicted in FIG.20, as set forth in SEQ ID NO:14), PIR3(YKL163W) (as depicted in
FIG.22, as set forth in SEQ ID NO:16), YPK2(YMR104C) (as depicted in FIG.24, as set
forth in SEQ ID NO:18), YLR194C (as depicted in FIG.26, as set forth in SEQ ID NO:20),
and PST1(YDR055W) (as depicted in FIG.28, as set forth in SEQ ID NO:22), comprising
contacting a fungal cell with one or more candidate molecules, and detecting said protein;
wherein an increase or decrease in the protein level relative to the level in the absence of
candidate molecules indicates that the molecule modulates PKC-pathway gene expression.

The invention provides a method of identifying a molecule that binds to a
ligand selected from the group consisting of (i) an S. cerevisiae PKC-pathway protein
selected from the group consisting of SLT2(YHR030C) (as depicted in FIG.18, as set forth
in SEQ ID NO:12), YKL161C (as depicted in FIG.20, as set forth in SEQ ID NO:14),
PIR3(YKL163W) (as depicted in FIG.22, as set forth in SEQ ID NO:16),
YPK2(YMR104C) (as depicted in FIG.24, as set forth in SEQ ID NO:18), YLR194C (as
depicted in FIG.26, as set forth in SEQ ID NO:20), and PST1(YDR055W) (as depicted in
FIG.28, as set forth in SEQ ID NO:22), (ii) a fragment of the S. cerevisiae PKC-pathway
protein, and (iii) a nucleic acid encoding the S. cerevisiae PKC-pathway protein or
fragment, the method comprising: (a) contacting the ligand with a plurality of molecules
under conditions conducive to binding between the ligand and the molecules; and (b)
identifying a molecule within the plurality that binds to the ligand.

The invention provides a method for determining whether a molecule affects
the function or activity of an S. cerevisiae Invasive Growth pathway in a cell comprising:
(a) contacting the cell with, or recombinantly expressing within the cell, the molecule; and (b)
determining whether the expression of one or more of the genes selected from the group
consisting of: KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG.35, as set forth in SEQ ID NO:29), is changed relative to said expression in the absence
of the molecule. In one embodiment, step (b) comprises determining whether
KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ ID NO:23) expression
increases.

The invention provides a kit comprising in one or more containers a) a
substance selected from the group consisting of an antibody against an S. cerevisiae
Invasive Growth pathway protein, a gene probe capable of hybridizing to RNA of an
Invasive Growth pathway gene, and pairs of gene primers capable of priming amplification
of at least a portion of an Invasive Growth pathway gene, and b) a molecule known to be
capable of perturbing the Invasive Growth pathway.

The invention provides a method for identifying a molecule that activates the
Invasive Growth pathway in yeast comprising contacting a yeast cell with one or more
candidate molecules, and detecting a change in the RNA expression of a reporter gene for
the Invasive Growth pathway relative to the expression of the reporter gene in a yeast cell
not contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ
ID NO:23), PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25),
YRL042C (as depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as
depicted in FIG.35, as set forth in SEQ ID NO:29).

The invention provides a method for identifying a molecule that activates the
Invasive Growth pathway in yeast comprising contacting a yeast cell with one or more
candidate molecules, and detecting a change in the protein expression of a reporter gene for
the Invasive Growth pathway relative to the expression of the reporter gene in a yeast cell
not contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ
ID NO:23), PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25),
YRL042C (as depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as
depicted in FIG.35, as set forth in SEQ ID NO:29). In one embodiment the fungal cell is a
transgenic cell.

The invention provides a method for identifying a molecule that modulates
the expression of an Invasive Growth pathway gene selected from the group consisting of
KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG.35, as set forth in SEQ ID NO:29), comprising recombinantly expressing in a fungal
cell one or more candidate molecules, and detecting the expression of said Invasive Growth
pathway gene; wherein an increase or decrease in the gene expression relative to the
expression in the absence of candidate molecules indicates that the molecule modulates
Invasive Growth pathway gene expression. In one embodiment the fungal cell is a
transgenic cell.

The invention provides a method for identifying a molecule that modulates
the activity of an Invasive Growth pathway protein selected from the group consisting of
KSS1(YGR040W) (as depicted in FIG.30, as set forth in SEQ ID NO:24),
PGU1(YJR153W) (as depicted in FIG.32, as set forth in SEQ ID NO:26), YRL042C (as
depicted in FIG.34, as set forth in SEQ ID NO:28), and SVS1(YPL163C) (as depicted in
FIG.36, as set forth in SEQ ID NO:30), comprising contacting a fungal cell with one or
more candidate molecules, and detecting said protein; wherein an increase or decrease in the
protein level relative to the level in the absence of candidate molecules indicates that the
molecule modulates Invasive Growth pathway gene expression.

The invention provides a method of identifying a molecule that binds to a
ligand selected from the group consisting of (i) an S. cerevisiae Invasive Growth pathway
protein selected from the group consisting of KSS1(YGR040W) (as depicted in FIG.30, as
set forth in SEQ ID NO:24), PGU1(YJR153W) (as depicted in FIG.32, as set forth in SEQ
ID NO:26), YRL042C (as depicted in FIG.34, as set forth in SEQ ID NO:28), and
SVS1(YPL163C) (as depicted in FIG.36, as set forth in SEQ ID NO:30), (ii) a fragment of
the S. cerevisiae Invasive Growth pathway protein, and (iii) a nucleic acid encoding the S.
cerevisiae Invasive Growth pathway protein or fragment, the method comprising: (a)
contacting the ligand with a plurality of molecules under conditions conducive to binding
between the ligand and the molecules; and (b) identifying a molecule within the plurality
that binds to the ligand.

4. BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 Schematic diagram of the method by which reporter genes and/or
target genes are identified.

FIG. 2 DNA sequence of S. cerevisiae YHR039C ergosterol-pathway gene.
The nucleic acid sequence of YHR039C is set forth in SEQ ID NO:1.

FIG. 3 The amino acid sequence of the protein encoded by S. cerevisiae
YHR039C ergosterol-pathway gene. The amino acid sequence of YHR039C is set forth in
SEQ ID NO:2.

FIG. 4 DNA sequence of S. cerevisiae YLR100W ergosterol-pathway gene.
The nucleic acid sequence of YLR100W is set forth in SEQ ID NO:3.

FIG. 5 The amino acid sequence of the protein encoded by S. cerevisiae
YLR100W ergosterol-pathway gene. The amino acid sequence of YLR100W is set forth in
SEQ ID NO:4.

FIG. 6 DNA sequence of S. cerevisiae YPL272C ergosterol-pathway gene.
The nucleic acid sequence of YPL272C is set forth in SEQ ID NO:5.

FIG. 7 The amino acid sequence of the protein encoded by S. cerevisiae
YPL272C ergosterol-pathway gene. The amino acid sequence of YPL272C is set forth in
SEQ ID NO:6.

FIG. 8 DNA sequence of S. cerevisiae YGR131W ergosterol-pathway gene.
The nucleic acid sequence of YGR131W is set forth in SEQ ID NO:7.

FIG. 9 The amino acid sequence of the protein encoded by S. cerevisiae
YGR131W ergosterol-pathway gene. The amino acid sequence of YGR131W is set forth in
SEQ ID NO:8.

FIG. 10 DNA sequence of S. cerevisiae YDR453C ergosterol-pathway gene.
The nucleic acid sequence of YDR453C is set forth in SEQ ID NO:9.

FIG. 11 The amino acid sequence of the protein encoded by S. cerevisiae
YDR453C ergosterol-pathway gene. The amino acid sequence of YDR453C is set forth in
SEQ ID NO:10.

FIG. 12 Ergosterol Biosynthetic Pathway. The various steps in the synthesis
of ergosterol in S. cerevisiae are shown, beginning with 2 acetyl-CoA. The genes encoding
enzymes in the pathway are shown in green. Antifungal agents that inhibit specific steps in
the pathway are shown in bold.

FIG. 13 Clotrimazole Titration Plot. This plot shows the complexity of the
drug signature and demonstrates genes which are induced or repressed in response to drug
treatment. An example of a gene which is induced to a high level is labeled YPL272C.

FIG. 14 Cluster analysis of ergosterol-pathway genes. When the signatures
of yeast mutant strains deleted for a number of ergosterol-pathway genes are compared,
certain genes cluster on the same branch. The genes YHR039C, YLR100W, and
YGL001C co-clustered and are reporters of the ergosterol-pathway. The genes YPL272C,
YGR131W, and YDR453C co-clustered and are also reporters of the ergosterol-pathway.
Clustering analysis of yeast genes reveals relationships between different genes, and
demonstrates that several genes behave similarly to several known ERG genes.

FIG. 15 PKC pathway of yeast as induced by pheromone or cell wall
integrity stimulus.

FIG. 16 Results of two-dimensional cluster analysis which was used to
identify the reporter genes and target genes of the PKC pathway.

FIG. 17A-B DNA sequence of S. cerevisiae SLT2(YHR030C) PKC-pathway
gene. The nucleic acid sequence of SLT2(YHR030C) is set forth in SEQ ID NO:11.

FIG. 18 The amino acid sequence of the protein encoded by S. cerevisiae
SLT2(YHR030C) PKC-pathway gene. The amino acid sequence of SLT2(YHR030C) is set
forth in SEQ ID NO:12.

FIG. 19A-B DNA sequence of S. cerevisiae YKL161C PKC-pathway gene.
The nucleic acid sequence of YKL161C is set forth in SEQ ID NO:13.

FIG. 20 The amino acid sequence of the protein encoded by S. cerevisiae
YKL161C PKC-pathway gene. The amino acid sequence of YKL161C is set forth in SEQ
ID NO:14.

FIG. 21A-B DNA sequence of S. cerevisiae PIR3(YKL163W) PKC-pathway
gene. The nucleic acid sequence of PIR3(YKL163W) is set forth in SEQ ID NO:15.

FIG. 22 The amino acid sequence of the protein encoded by S. cerevisiae
PIR3(YKL163W) PKC-pathway gene. The amino acid sequence of PIR3(YKL163W) is
set forth in SEQ ID NO:16.

FIG. 23A-B DNA sequence of S. cerevisiae YPK2(YMR104C) PKC-pathway
gene. The nucleic acid sequence of YPK2(YMR104C) is set forth in SEQ ID NO:17.

FIG. 24 The amino acid sequence of the protein encoded by S. cerevisiae
YPK2(YMR104C) PKC-pathway gene. The amino acid sequence of YPK2(YMR104C) is
set forth in SEQ ID NO:18.

FIG. 25A-B DNA sequence of S. cerevisiae YLR194C PKC-pathway gene. 
The nucleic acid sequence of YLR194C is set forth in SEQ ID NO:19. 

FIG. 26 The amino acid sequence of the protein encoded by S. cerevisiae 
YLR194C PKC-pathway gene. The amino acid sequence of YLR194C is set forth in SEQ 
ID NO:20. 

FIG. 27A-B DNA sequence of S. cerevisiae PST1(YDR055W) PKC-pathway
gene. The nucleic acid sequence of PST1(YDR055W) is set forth in SEQ ID NO:21.

FIG. 28 The amino acid sequence of the protein encoded by S. cerevisiae
PST1(YDR055W) PKC-pathway gene. The amino acid sequence of PST1(YDR055W) is set
forth in SEQ ID NO:22.

FIG. 29 DNA sequence of S. cerevisiae KSS1(YGR040W) Invasive Growth
pathway gene. The nucleic acid sequence of KSS1(YGR040W) is set forth in SEQ ID
NO:23.

FIG. 30 The amino acid sequence of the protein encoded by S. cerevisiae 
KSS1(YGR040W) Invasive Growth pathway gene. The amino acid sequence of 
KSS1(YGR040W) is set forth in SEQ ID NO:24. 

FIG. 31 DNA sequence of S. cerevisiae PGU1(YJR153W) Invasive Growth
pathway gene. The nucleic acid sequence of PGU1(YJR153W) is set forth in SEQ ID
NO:25.

FIG. 32 The amino acid sequence of the protein encoded by S. cerevisiae
PGU1(YJR153W) Invasive Growth pathway gene. The amino acid sequence of
PGU1(YJR153W) is set forth in SEQ ID NO:26.

FIG. 33 DNA sequence of S. cerevisiae YHR042C Invasive Growth
pathway gene. The nucleic acid sequence of YHR042C is set forth in SEQ ID NO:27.

FIG. 34 The amino acid sequence of the protein encoded by S. cerevisiae
YHR042C Invasive Growth pathway gene. The amino acid sequence of YHR042C is set
forth in SEQ ID NO:28.

FIG. 35 DNA sequence of S. cerevisiae SVS1(YPL163C) Invasive Growth 
pathway gene. The nucleic acid sequence of SVS1(YPL163C) is set forth in SEQ ID 
NO:29. 

FIG. 36 The amino acid sequence of the protein encoded by S. cerevisiae
SVS1(YPL163C) Invasive Growth pathway gene. The amino acid sequence of
SVS1(YPL163C) is set forth in SEQ ID NO:30.

5. DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates, in part, to methods for identifying one or more
reporter genes and/or target genes for a particular biological pathway of interest. The
reporter genes of this invention are particularly useful for analyzing the activity of particular
biological pathways of interest, and may be further used in the design of drugs, drug
therapies or other biological agents (e.g., insecticides, herbicides, fungicides, antibiotics or
antivirals) to target a particular biological pathway. The present invention also relates to
methods for identifying one or more target genes for a particular biological pathway of
interest. Target genes of the invention are useful as specific targets for drugs which may be
designed to enhance, inhibit, or modulate a particular biological pathway. Methods to
identify a gene which modifies the function or structure of a member (e.g., compound or gene
product) of a particular biological pathway are provided.

The present invention provides examples of reporter genes and/or target
genes which have been discovered by the methods of the invention. Specifically, the
inventors have made the surprising discovery that five S. cerevisiae genes (previously of
unknown function) form clustered co-regulated sets of genes and are reporters of the
ergosterol-pathway. The methods of the invention are also exemplified in that the inventors
have specifically discovered six S. cerevisiae reporter genes of the protein kinase C (PKC)
pathway. Two of these genes are also novel target genes of the PKC pathway and provide
targets for the development of PKC pathway-specific drugs, drug therapies, or other related
biological or therapeutic agents. The methods of the invention are further exemplified by
the discovery of four novel reporter genes of the S. cerevisiae Invasive Growth pathway.
One of these genes also serves as a target gene for the Invasive Growth pathway, and may
be used to develop Invasive Growth pathway-specific drugs, drug therapies, or other related
biological or therapeutic agents.

As described herein, the inventors developed a strategy to search the genome
of an organism for cellular constituents which function in a biological pathway of interest.
Specifically, the inventors have developed a strategy to search the genome of an organism
for reporter genes and/or target genes of a biological pathway of interest. In one
embodiment, as described herein, the inventors developed a strategy to search the genome
of S. cerevisiae for genes which function in a biological pathway of interest. Any pathway
of interest may be examined by the methods of the invention. In specific embodiments, the
methods of the invention are illustrated by way of the ergosterol-pathway, the PKC
pathway, and the Invasive Growth pathway. Additionally, the genome of any species may
be used in the methods of the invention, so long as the genome of the species is at least
partially sequenced. In several embodiments of the invention, 20-30%, 30-40%, or 40-60%
of the sequence of the genome of the species examined by the methods of the invention is
known. In preferred embodiments of the invention, 60-75%, 75-85%, or 85-90% of the
sequence of the genome of the species examined by the methods of the invention is known.
In highly preferred embodiments of the invention, 90-95%, 95-98%, or 98% or more of the
sequence of the genome of the species examined by the methods of the invention is known.
In a most preferred embodiment of the invention, the entire sequence of the genome of the
species examined by the methods of the invention is known.

The methods described herein relate to DNA microarray technology as
described in Section 5.1 et seq., and in U.S. patent application serial no. 09/179,569, filed
October 27, 1998, now pending, U.S. patent application serial no. 09/220,275, filed
December 23, 1998, now pending, and U.S. patent application serial no. 09/220,142, filed
December 23, 1998, now pending, each of which is incorporated herein by reference in its
entirety. The reporter genes and target
genes of the invention constitute very useful tools for probing the function, regulation,
activation, and inhibition of their corresponding pathways. Biochemical and genetic
analysis of pathways involving the reporters and particularly the targets of the invention can
be expected to lead to the discovery of new drug targets, therapeutic proteins, diagnostics,
and prognostics useful in the treatment of diseases and clinical problems, for example, those
associated with the activation or inactivation of a particular pathway.

Methods for biochemical analysis of pathways of the invention are provided.
Such methods may yield results of importance to human disease. For example, systematic
identification of participants in the ergosterol-pathway, or of components regulating synthesis
of ergosterol, provides leads to the identification of drug targets, therapeutic proteins,
diagnostics, or prognostics useful for treatment or management of fungal infections.

The invention is illustrated by way of examples set forth in Section 6 below
which disclose, inter alia, the characterization of reporters and targets of the invention
including reporter genes of the S. cerevisiae ergosterol-pathway, PKC-pathway, and
Invasive Growth pathway using DNA microarray technology.

For clarity of disclosure, and not by way of limitation, the detailed 
description of the invention is divided into the subsections which follow. 

5.1. CHARACTERIZATION PROCEDURES 

The present invention relates, in part, to methods for identifying one or more
reporter genes for a particular biological pathway of interest. As used herein, a reporter
gene refers to any gene for which a change in its expression and/or in the activity of its encoded
RNA or protein is indicative of a change in the activity of a particular biological pathway
of interest. Thus, the reporter genes of this invention are useful for analyzing
the activity of particular biological pathways of interest, e.g., in the design of drugs, drug
therapies or other biological agents (e.g., insecticides, herbicides, fungicides, antibiotics or
antivirals) to target particular biological pathways.
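
The definition above can be illustrated with a minimal sketch, assuming that reporter expression is summarized as a single abundance value per gene in treated and untreated cells. The function names, the two-fold threshold, and the abundance values are hypothetical and are used only for illustration; they are not taken from the examples of Section 6.

```python
import math

def log2_ratio(treated, untreated, floor=1e-9):
    """Log2 ratio of treated to untreated abundance; the floor avoids log(0)."""
    return math.log2(max(treated, floor) / max(untreated, floor))

def call_reporter_changes(treated, untreated, threshold=1.0):
    """Label each reporter 'increase', 'decrease', or 'unchanged'.

    threshold is in log2 units, so 1.0 corresponds to a two-fold change
    (an assumed cutoff, not one prescribed by the text).
    """
    calls = {}
    for gene, level in treated.items():
        ratio = log2_ratio(level, untreated[gene])
        if ratio >= threshold:
            calls[gene] = "increase"
        elif ratio <= -threshold:
            calls[gene] = "decrease"
        else:
            calls[gene] = "unchanged"
    return calls

# Made-up abundances in arbitrary units, for illustration only.
untreated = {"YPL272C": 100.0, "YGR131W": 80.0, "YDR453C": 50.0}
treated = {"YPL272C": 850.0, "YGR131W": 90.0, "YDR453C": 10.0}
calls = call_reporter_changes(treated, untreated)
```

A change call on one or more reporters would then be read as an indication that the candidate molecule perturbs the pathway; in practice the cutoff would be chosen with regard to measurement noise.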

The present invention also relates, in part, to methods for identifying one or
more target genes for a particular biological pathway of interest. As used herein, a target
gene refers to any gene whose expression and/or activity is necessary for normal activity or
function of the pathway. Thus, the target genes of this invention are useful as targets for
drugs designed to enhance, inhibit, or modulate a particular biological pathway. Thus, the
target genes of this invention are useful targets for design of drugs, drug therapies or other
biological agents (e.g., insecticides, herbicides, fungicides, antibiotics or antivirals) directed
to a particular biological pathway.

Biological pathways, as used herein, refer to collections of cellular
constituents (e.g., protein abundances or activities, protein phosphorylation, RNA species
abundances such as mRNA species abundances, or DNA species abundances such as
abundances of cDNA species derived from mRNA; as used herein the term "cellular
constituent" is not intended to refer to known subcellular organelles such as mitochondria,
lysosomes, etc.) which are related in that each cellular constituent in the collection is
influenced according to some biological mechanism by one or more other cellular
constituents in the collection. Biological pathways of the present invention therefore
include well-known biochemical synthetic pathways including, for example, the yeast
ergosterol pathway, in which, e.g., molecules are broken down to provide cellular energy
stores or in which protein or nucleic acid precursors or other cellular compounds are
synthesized. Signaling and control pathways typically include primary or intermediate
signaling molecules, as well as proteins participating in the signal or control cascades
usually characterizing these pathways. In signaling pathways, binding of a signal molecule
to a receptor usually directly influences the abundances of intermediate signaling molecules
and indirectly influences, e.g., the degree of phosphorylation (or other modification) of
pathway proteins. Both of these effects in turn influence activities of cellular proteins that
are key effectors of the cellular processes initiated by the signal, for example, by affecting
the transcriptional state of the cell. Control pathways, such as those controlling the timing
and occurrence of the cell cycle, are similar. Here, multiple, often ongoing, cellular events
are temporally coordinated, often with feedback control, to achieve a consistent outcome,
such as cell division with chromosome segregation. This coordination is a consequence of
functioning of the pathway, often mediated by mutual influences of proteins on each other's
degree of phosphorylation or other modification. Biological pathways of the invention
also include, but are not limited to: signaling pathways, control pathways, mating pathways,
cell cycle pathways, cell division pathways, cell repair pathways, small molecule synthesis
pathways, protein synthesis pathways, DNA synthesis pathways, RNA synthesis pathways,
DNA repair pathways, stress-response pathways, cytoskeletal pathways, steroid pathways,
receptor-mediated signal transduction pathways, transcriptional pathways, translational
pathways, immune response pathways, heat-shock pathways, motility pathways, secretion
pathways, endocytotic pathways, protein sorting pathways, phagocytic pathways,
photosynthetic pathways, excretion pathways, electrical response pathways,
pressure-response pathways, protein modification pathways, small-molecule response pathways,
toxic-molecule response pathways, transformation pathways, etc. Specifically, the invention
herein is illustrated in Section 6, by way of reporter genes which have been discovered
for the ergosterol-pathway and the protein kinase C pathway. Other well-known control
pathways seek to maintain optimal levels of cellular metabolites in the face of a fluctuating
environment. Further examples of cellular pathways operating according to understood
mechanisms are well known and will therefore be readily apparent to those of skill in the
art.

The methods of the invention may be used to identify reporter genes or target
genes in any cell type from any species of organism. In one preferred embodiment, the
methods of the invention are used to identify reporter genes and target genes in S.
cerevisiae. However, in other preferred embodiments the methods of the invention are used
to identify reporter genes and/or target genes in other cell types including prokaryotic and
eukaryotic, vertebrate and invertebrate, and in other species, including plant, animal, insect,
worm, fungus, yeast, fish, and bird species. In one preferred embodiment the methods of the
invention identify one or more reporter genes and/or target genes in a mammalian species of
interest (e.g., mouse, rat, rabbit, dog, cat, horse, sheep, pig, cattle, etc.). In one particularly
preferred embodiment, the methods of the invention identify one or more reporter genes
and/or target genes in a human. In another preferred embodiment the methods of the
invention identify one or more reporter genes and/or target genes in a species which is
amenable to genetic manipulation of the entire organism (e.g., fly or worm). In other
embodiments, the methods of the invention identify one or more reporter genes and/or
target genes in other species described herein.

The reporter genes of the present invention comprise genes whose genetic
transcripts (i.e., mRNA transcripts or cDNA molecules produced from mRNA transcripts)
"co-vary" and/or are "co-regulated." Specifically, the reporter genes of the invention
increase or decrease the abundance of their transcripts under some set of conditions which is
associated with a particular biological pathway of interest and/or with other genes which are
associated with the particular biological pathway of interest.
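
As a hedged illustration of what it means for transcripts to "co-vary," the following sketch scores candidate genes by the Pearson correlation of their expression profiles with that of a gene already associated with the pathway. The gene labels, the profile values, and the 0.9 cutoff are assumptions made for the example; the text does not prescribe a particular similarity measure at this point.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def covarying_genes(profiles, reference, min_r=0.9):
    """Return genes whose profile correlates with the reference at or above min_r."""
    ref = profiles[reference]
    return sorted(
        gene for gene, profile in profiles.items()
        if gene != reference and pearson(profile, ref) >= min_r
    )

# Hypothetical log expression ratios across five conditions (made-up values).
profiles = {
    "known_pathway_gene": [2.0, 1.5, 0.1, -0.2, 1.8],
    "candidate_A": [1.9, 1.4, 0.0, -0.1, 1.7],
    "candidate_B": [-0.5, 0.3, 1.2, 0.9, -0.4],
}
hits = covarying_genes(profiles, "known_pathway_gene")
```

Here only `candidate_A` tracks the reference profile closely; a gene such as `candidate_B`, whose transcript moves in the opposite direction, is not returned by this particular positive-correlation filter.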

The target genes of the present invention comprise genes whose genetic
transcripts (i.e., mRNA transcripts or cDNA molecules produced from mRNA transcripts)
"co-vary" and/or are "co-regulated." Specifically, the target genes of the invention increase
or decrease the abundance of their transcripts under some set of conditions which is
associated with a particular biological pathway of interest and/or with other genes which are
associated with the particular biological pathway of interest. Further, target genes of the
invention are those genes of a geneset whose expression and/or activity are necessary for the
activity or function of the pathway. Methods for identifying such co-varying genes are
described generally and in detail in U.S. patent application serial no. 09/179,569, filed
October 27, 1998, now pending, in U.S. patent application serial no. 09/220,275, filed
December 23, 1998, now pending, and in U.S. patent application serial no. 09/220,142, filed
December 23, 1998, now pending, each of which is incorporated herein by reference in
its entirety. These methods are described below as they particularly pertain to identifying
reporter genes. Specifically, subsection 5.1.1 describes methods such as cluster analysis
which may be used to identify covarying genesets. Such cluster analysis methods are
preferably applied to measurements of the "transcriptional state" of a cell; i.e., to
measurements of abundances of genetic transcripts (mRNA or cDNA) of a cell. Most
preferably, the transcriptional state of a cell is measured using polynucleotide microarrays.
Accordingly, subsections 5.1.2-5.1.5 describe methods of measuring the transcriptional state
using microarrays, including methods of constructing microarrays, methods of hybridizing
polynucleotide samples (e.g., from cells) to microarrays, and signal detection on
microarrays. Subsection 5.1.6 describes other, less preferred methods by which the
transcriptional state of a cell may be measured.

Although for simplicity the disclosure often makes reference to single cells
(e.g., "RNA is isolated from a cell exposed to a particular drug"), it will be understood by
those of skill in the art that more often any particular step of the invention will be carried
out using a plurality of genetically similar cells, e.g., from a cultured cell line. Such similar
cells are referred to herein as a "cell type." Such cells may be either from naturally
single-celled organisms (e.g., E. coli or S. cerevisiae) or derived from multi-cellular higher
organisms (e.g., from plant or animal organisms, including mammalian organisms such as a
human cell line).

5.1.1. CLUSTER ANALYSIS 

In a preferred aspect of the invention, the reporter genes and/or target genes
may be identified by methods using cluster analysis. The cluster analysis technique is based
on the principle that, in general, cellular constituents (e.g., gene transcripts) will respond in a
coordinated fashion to a particular stimulus, treatment, or biological state.
Therefore, subsets of cellular constituents will typically change together, e.g., by increasing
or decreasing their abundances and/or activities, under some set of conditions which
preferably include the conditions or perturbations of interest to a user of the present
invention (e.g., treatment with antifungal compounds).

Further, the abundances and/or activities of individual cellular constituents
are not all regulated independently. Rather, individual cellular constituents from a cell will
typically share one or more regulatory elements with other cellular constituents from the
same cell. For example, and not by way of limitation, in embodiments where the cellular
constituents comprise genetic transcripts, the rates of transcription are generally regulated
by regulatory sequence patterns, i.e., transcription factor binding sites. Typically, several
genes within a cell may share one or more transcription factor binding sites. Such cellular
constituents are therefore said to be "co-regulated," and comprise co-regulated cellular
constituent sets or "co-regulated sets." For example, and not by way of limitation, genes
tend to increase or decrease their rates of transcription together when they possess similar
transcription factor binding sites. Such a mechanism accounts for the coordinated responses
of genes to particular signaling inputs. For example, see Madhani and Fink, 1998,
Transactions in Genetics 14:151-155; and Arnone and Davidson, 1997, Development
124:1851-1864. For instance, individual genes which encode different components of a
necessary protein or cellular structure are generally co-regulated. Also, duplicated genes
(see, e.g., Wagner, 1996, Biol. Cybern. 74:557-567) are co-regulated to the extent that
genetic mutations have not led to functional divergence in their regulatory regions. Further,
because genetic regulatory sequences are modular (see, e.g., Yuh et al., 1998, Science
279:1896-1902), the more regulatory "modules" two genes have in common, the greater the
variety of conditions under which they will be co-regulated in their transcription rates.
Physical separation between modules along the chromosome is also an important
determinant since co-activators are often involved.

In particularly preferred embodiments of the present invention, the cellular
constituents in a biological profile comprise genetic transcripts such as mRNA abundances,
or abundances of cDNA molecules produced from mRNA transcripts. In such
embodiments, the co-regulated sets comprise genes which are generally co-regulated to
some extent. Such co-regulated sets are referred to herein as "genesets." Thus, in
particularly preferred embodiments of the present invention, the co-regulated cellular
constituent sets are genesets. In one specific embodiment of the present invention, the
geneset comprises genes of the ergosterol-pathway. In another specific embodiment of the
present invention, the geneset comprises genes of the PKC-pathway. In another specific
embodiment of the present invention, the geneset comprises genes of the Invasive Growth
pathway.

In a specific embodiment of the invention, when the genome of the organism
of interest has been sequenced, the number of ORFs can be determined and mRNA coding
regions identified by analysis of the DNA sequence. For example, the genome of
Saccharomyces cerevisiae has been completely sequenced, and is reported to have
approximately 6275 ORFs longer than 99 amino acids. Analysis of the ORFs indicates that
there are 5885 ORFs that are likely to encode protein products (Goffeau et al., 1996,
Science 274:546-567). However, many of these genes do not have a known function, nor
are they associated with a known function. The invention herein provides methods for
assigning function to such ORFs, by the methods of the invention including cluster analysis.

5.2. PATHWAY RESPONSE PROFILES & PERTURBATIONS

In one aspect of the invention, gene expression change in response to a large
number of perturbations is used to construct a clustering tree for the purpose of defining
genesets. Preferably, the perturbations should target different pathways. In order to
measure expression responses to the pathway perturbation, biological samples are subjected
to perturbations to pathways of interest. The samples exposed to the perturbation and
samples not exposed to the perturbation are used to construct transcript arrays, which are
measured to find the mRNAs with modified expression and the degree of modification due
to exposure to the perturbation. Thereby, the perturbation-response profile is obtained.

FIG. 1 illustrates an overview of the method by which reporter genes and/or
target genes are identified. The methods analyze a plurality of "response profiles" which
are preferably obtained or provided (FIG. 1, 101) from measurements of the transcriptional
or translational state of a cell (e.g., measurements of mRNA abundances or of abundances
of cDNA derived from mRNA) under a variety of different experimental conditions. More
precisely, the transcriptional or translational state of the cell in response to a plurality of
different perturbations to the cell is measured. In preferred embodiments, the
transcriptional or translational state of the cell is measured in response to at least ten
different perturbations to the cell, more preferably in response to at least 100 perturbations,
still more preferably in response to at least 400 perturbations, and yet more preferably in
response to over 1,000 different perturbations.

Perturbations to the cell may comprise, for example, exposure to one or more
drugs at one or more levels (i.e., at one or more concentrations of the drug). Perturbations
may also comprise genetic alterations to the cell such as genetic "knockouts" wherein one or
more genes are deleted and/or no longer expressed in the cell. Other possible genetic
alterations include regulated expression of one or more genes in the cell, wherein the level
of expression of the one or more genes is altered (e.g., increased or decreased) in a
controlled manner, e.g., by means of a titratable promoter system. Such perturbations, as
well as others which may be used to identify reporter genes and/or target genes, are
described in detail in subsection 5.3 below.

Perturbations to the cell may further comprise changes in one or more
aspects of the physical environment of the cell. Such environmental changes can include,
for example, changes in the temperature (e.g., a temperature elevation of 10 °C) or exposure
to moderate doses of radiation. Other exemplary environmental changes include changes in
the nutritional environment, such as the presence or absence of particular sugars, amino
acids, and so forth.

In preferred embodiments, some of the perturbations are perturbations which
are known to affect a particular biological pathway of interest; i.e., the biological pathway
for which one or more reporter genes and/or target genes are to be identified. In some
preferred embodiments, about 5-50%, preferably about 10-30%, more preferably about
10-25%, still more preferably about 10-20%, and most preferably about 10-15% of the
perturbations are perturbations which are known to affect a particular biological pathway of
interest.
At least two genes (i.e., at least two mRNA or cDNA species) are measured
in response to each perturbation. Preferably, at least 10 genes are measured in response to
each perturbation, more preferably more than 100 genes, still more preferably more than
1,000 genes, and most preferably more than 10,000 genes. Preferably mRNA or cDNA
abundances are measured for more than 10% of the genes of the cell being analyzed. More
preferably, mRNA or cDNA abundances are measured for more than 25%, more than 50%,
more than 75%, more than 80%, more than 90%, more than 95%, or more than 99% of the
genes of the cell being analyzed. Most preferably, mRNA or cDNA abundances are
measured for all of the genes of the cell being analyzed. In preferred embodiments, some of
the genes measured in response to each perturbation are genes which are known to be
involved in a particular biological pathway of interest, i.e., the biological pathway for which
one or more reporter genes are to be identified. In some preferred embodiments, about
5-50%, preferably about 10-30%, more preferably about 10-25%, still more preferably about
10-20%, and most preferably about 10-15% of the genes measured in response to each
perturbation are genes which are known to be involved in a particular biological pathway of
interest.

In preferred embodiments, the response profiles analyzed by the methods of
the invention are optionally screened, before the analysis, to select only those cellular
constituents that have a significant response in some fraction of the profiles (FIG. 1, 102).
In particular, although the profiles may cover up to ~10^5 genes, in most perturbations a
large part or even a majority of these genes will not change significantly, or the changes
may be small and dominated by experimental error. Accordingly, in most embodiments, it
will be unhelpful and cumbersome to use these genes to identify reporter genes according
to the methods of this invention. Thus, they are preferably deleted from all profiles.

In certain embodiments, only genes that have a response greater than or equal
to two standard errors in more than N profiles are selected for subsequent analysis, where N
may be one or more and is preferably selected by the user. Preferably, N will tend to be
larger for larger sets of response profiles. For example, in one preferred embodiment N
may be approximately equal to the square root of the number of response profiles analyzed.
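
The screening step described above can be sketched as follows. This is a minimal illustration only, not the implementation used by the inventors: the function name `screen_genes`, the list-of-lists data layout, and the rounding of the square root to the nearest integer are all assumptions made for the example.

```python
import math

def screen_genes(responses, std_errors, n_sigma=2.0, min_profiles=None):
    """Return indices of genes with a significant response in enough profiles.

    responses[p][g]  : response of gene g in profile p (e.g., a log ratio)
    std_errors[p][g] : standard error of that measurement
    A gene is kept when |response| >= n_sigma * standard error in MORE than
    `min_profiles` profiles (the "N" of the text).
    """
    n_profiles = len(responses)
    n_genes = len(responses[0])
    if min_profiles is None:
        # Assumed default: N ~ sqrt(number of profiles), rounded to an integer.
        min_profiles = round(math.sqrt(n_profiles))
    kept = []
    for g in range(n_genes):
        hits = sum(
            1
            for p in range(n_profiles)
            if abs(responses[p][g]) >= n_sigma * std_errors[p][g]
        )
        if hits > min_profiles:
            kept.append(g)
    return kept

# Hypothetical data: 4 profiles x 2 genes, so N defaults to 2.
responses = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.1], [0.1, 0.1]]
std_errors = [[0.4, 0.4]] * 4
selected = screen_genes(responses, std_errors)  # gene 0 has 3 hits, gene 1 has 2
```

Genes not returned by such a screen would be dropped from every profile before clustering.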
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The invention provides a method for determining whether a molecule affects 
the function or activity of an ergosterol pathway in a cell comprising: (a) contacting the cell
with, or recombinantly expressing within a cell, the molecule; and (b) determining whether
the expression of one or more of the genes selected from the group consisting of: YHR039C
(as depicted in FIG. 2, as set forth in SEQ ID NO:1), YLW100W (as depicted in FIG. 4, as set
forth in SEQ ID NO:3), YPL272C (as depicted in FIG. 6, as set forth in SEQ ID NO:5),
YGR131W (as depicted in FIG. 8, as set forth in SEQ ID NO:7), and YDR453C (as depicted
in FIG. 10, as set forth in SEQ ID NO:9) is changed relative to said expression in the
absence of the molecule.

The invention provides a method for determining whether a molecule affects
the function or activity of a PKC pathway in a cell comprising: (a) contacting the cell with,
or recombinantly expressing within a cell, the molecule; and (b) determining whether the
expression of one or more of the genes selected from the group consisting of:
SLT2(YHR030C) (as depicted in FIG. 17A-B, as set forth in SEQ ID NO:11), YKR161C
(as depicted in FIG. 19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted
in FIG. 21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in
FIG. 23A-B, as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG. 25A-B, as set
forth in SEQ ID NO:19), and ST1(YDR055W) (as depicted in FIG. 27A-B, as set forth in
SEQ ID NO:21) is changed relative to said expression in the absence of the molecule.

The invention provides a method for determining whether a molecule affects
the function or activity of an S. cerevisiae Invasive Growth pathway in a cell comprising:
(a) contacting the cell with, or recombinantly expressing within a cell, the molecule; and (b)
determining whether the expression of one or more of the genes selected from the group
consisting of: KSS1(YGR040W) (as depicted in FIG. 29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG. 31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG. 33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG. 35, as set forth in SEQ ID NO:29), is changed relative to said expression in the absence
of the molecule.

5.2.1. CLUSTER ANALYSIS ALGORITHMS

Response profiles having been thus obtained and, optionally, screened to
select genes with significant responses, the genes and/or the individual response profiles
are each grouped according to their similarities (FIG. 1, 103 and 104). In particular, the
genes being analyzed according to the methods of the present invention are grouped or
re-ordered into co-varying sets (FIG. 1, 103). Likewise, a similar grouping may be optionally
performed to group the response profiles according to their similarity (FIG. 1, 104). The
steps of grouping the genes and grouping the response profiles may be performed in any
order; i.e., the genes may be grouped first. Preferably the genes and/or response profiles are
each grouped by means of a pattern recognition procedure or algorithm, most preferably by
means of a clustering procedure or algorithm. Such algorithms are well known to those of
skill in the art, and are reviewed, e.g., by Fukunaga, 1990, Statistical Pattern Recognition,
2nd Ed., London: Academic Press; Everitt, 1974, Cluster Analysis, London: Heinemann
Educ. Books; Hartigan, 1975, Clustering Algorithms, New York: Wiley; Sneath & Sokal,
1973, Numerical Taxonomy, Freeman; and Anderberg, 1973, Cluster Analysis for
Applications, New York: Academic Press, each of which is incorporated herein by
reference in its entirety. Such algorithms include, for example, hierarchical agglomerative
clustering algorithms, the "k-means" algorithm of Hartigan (supra), and model-based
clustering algorithms such as hclust by MathSoft, Inc. In one preferred embodiment, the
clustering analysis of the present invention is done using a hierarchical clustering algorithm,
most preferably the hclust algorithm (see, e.g., the "hclust" routine from the software package
S-Plus, MathSoft, Inc., Cambridge MA).

The clustering algorithms used in the present invention operate on tables of
data containing gene expression measurements such as those described above. Specifically,
the data tables analyzed by the clustering methods of the present invention comprise an
m × k array or matrix wherein m is the total number of experimental conditions or perturbations
and k is the number of genes measured and/or analyzed.

The clustering algorithms of the invention analyze such arrays or matrices to
determine dissimilarities between the individual genes or between individual response
profiles. For example, the dissimilarity between two genes i and j may be expressed
mathematically as the "distance" $I_{ij}$. A variety of distance metrics known to those
skilled in the art may be used in the clustering algorithms of the invention. For
example, in one embodiment, the Euclidean distance is determined according to the formula

$$I_{ij} = \left( \sum_n \left| v_i^{(n)} - v_j^{(n)} \right|^2 \right)^{1/2} \qquad (1)$$

wherein $v_i^{(n)}$ and $v_j^{(n)}$ are the responses of genes i and j, respectively, to the perturbation n. In
other embodiments, the Euclidean distance in Equation 1 above is squared to place
progressively greater weight on cellular constituents that are further apart. In alternative
embodiments, the distance measure $I_{ij}$ is the Manhattan distance provided by

$$I_{ij} = \sum_n \left| v_i^{(n)} - v_j^{(n)} \right| \qquad (2)$$

In certain other embodiments the response profile data is categorical (i.e., each of the
measured changes in gene expression is represented as either 1 or 0 in each profile), and the
distance measure is preferably a percent disagreement defined by:

$$I_{ij} = \frac{\text{No. of } v_i^{(n)} \neq v_j^{(n)}}{N} \qquad (3)$$

wherein N is the total number of response profiles.

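In particularly preferred embodiments, the distance is defined as $I_{ij} = 1 - r_{ij}$,
wherein $r_{ij}$ is the "correlation coefficient" or normalized "dot product" between the genes i
and j. In particular, $r_{ij}$ is preferably defined by

$$r_{ij} = \frac{\mathbf{v}_i \cdot \mathbf{v}_j}{|\mathbf{v}_i|\,|\mathbf{v}_j|} \qquad (4)$$

wherein the dot product $\mathbf{v}_i \cdot \mathbf{v}_j$ is provided by the expression

$$\mathbf{v}_i \cdot \mathbf{v}_j = \sum_n v_i^{(n)} v_j^{(n)} \qquad (5)$$

and $|\mathbf{v}_i| = (\mathbf{v}_i \cdot \mathbf{v}_i)^{1/2}$; $|\mathbf{v}_j| = (\mathbf{v}_j \cdot \mathbf{v}_j)^{1/2}$.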
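
The distance measures described above (Euclidean, Manhattan, percent disagreement, and one minus the normalized dot product) can be sketched in a few lines. The function names are chosen for this illustration only. Note that the correlation here is the uncentered, normalized dot product as the text defines it, not a mean-centered Pearson coefficient, and that profiles are assumed not to be all zeros.

```python
import math

def euclidean(vi, vj):
    """Euclidean distance between two response vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vi, vj)))

def manhattan(vi, vj):
    """Manhattan (city-block) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(vi, vj))

def percent_disagreement(vi, vj):
    """Fraction of profiles in which two categorical (0/1) responses differ."""
    return sum(1 for a, b in zip(vi, vj) if a != b) / len(vi)

def correlation(vi, vj):
    """Normalized dot product r_ij = (vi . vj) / (|vi| |vj|)."""
    dot = sum(a * b for a, b in zip(vi, vj))
    norm_i = math.sqrt(sum(a * a for a in vi))
    norm_j = math.sqrt(sum(b * b for b in vj))
    return dot / (norm_i * norm_j)

def correlation_distance(vi, vj):
    """Distance I_ij = 1 - r_ij; 0 for identical direction, 2 for opposite."""
    return 1.0 - correlation(vi, vj)
```

With this definition, two genes whose responses are proportional to one another have distance 0, uncorrelated genes have distance 1, and exactly opposite responses give distance 2.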

In still other embodiments, the distance measure may be the Chebychev
distance, the power distance, or the percent disagreement, all of which are well known in
the art. Most preferably the distance measure is appropriate to the biological questions
being asked, i.e., for identifying co-regulated and/or co-varying genesets and, in particular,
for identifying reporter genes and/or target genes within such genesets. Thus, in another
particularly preferred embodiment, the correlation coefficient comprises a weighted dot
product between genes i and j defined by the equation

$$r_{ij} = \frac{\sum_n \dfrac{v_i^{(n)} v_j^{(n)}}{\sigma_i^{(n)} \sigma_j^{(n)}}}{\left[ \sum_n \left( \dfrac{v_i^{(n)}}{\sigma_i^{(n)}} \right)^2 \sum_n \left( \dfrac{v_j^{(n)}}{\sigma_j^{(n)}} \right)^2 \right]^{1/2}} \qquad (6)$$

wherein $\sigma_i^{(n)}$ and $\sigma_j^{(n)}$ are the standard errors associated with the measurement of genes i and
j, respectively, in experiment n.
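
A weighted correlation of this kind might be computed as sketched below. The sketch assumes that the weighting divides each response by its standard error before the normalized dot product is taken; that form, and the name `weighted_correlation`, are assumptions of this example rather than a statement of the exact equation used by the inventors.

```python
import math

def weighted_correlation(vi, vj, sigma_i, sigma_j):
    """Normalized dot product of error-scaled responses.

    vi, vj           : responses of genes i and j across experiments
    sigma_i, sigma_j : standard errors of those measurements (all > 0)
    Each response is divided by its standard error (assumed weighting), so
    noisy measurements contribute less to the correlation.
    """
    xi = [v / s for v, s in zip(vi, sigma_i)]
    xj = [v / s for v, s in zip(vj, sigma_j)]
    dot = sum(a * b for a, b in zip(xi, xj))
    norm = math.sqrt(sum(a * a for a in xi) * sum(b * b for b in xj))
    return dot / norm

# Hypothetical example: the second experiment disagrees strongly but is very
# noisy (standard error 10), so it barely affects the weighted correlation.
r_weighted = weighted_correlation([1.0, 5.0], [1.0, -5.0], [0.1, 10.0], [0.1, 10.0])
```

When all standard errors are equal, this quantity reduces to the unweighted normalized dot product.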
The correlation coefficients of Equations 4 and 6 are bounded between values
of +1, which indicates that the two genes are perfectly correlated and essentially identical in
their response to perturbations, and -1, which indicates that the two genes are
"anti-correlated" or "anti-sense" (i.e., opposites). Thus, these correlation coefficients are
particularly preferable in embodiments of the invention where the responses all have the
same sign. However, in other embodiments it is preferable to identify genesets which are
co-regulated or involved in the same biological response or pathways but which comprise
similar and anti-correlated responses. In such embodiments, it is preferable to use the
absolute value of Equation 4 or 6, i.e., $|r_{ij}|$, as the correlation coefficient.

In still other embodiments, the relationships between co-regulated and/or
co-varying genesets may be even more complex, such as in instances wherein multiple
biological pathways (e.g., signaling pathways) converge on the same cellular constituent to
produce different outcomes. In such embodiments, it is preferable to use a correlation
coefficient $r_{ij} = r_{ij}^{(\mathrm{change})}$ which is capable of identifying co-varying and/or co-regulated
genes irrespective of the sign. The correlation specified by Equation 7 below is particularly
useful in such embodiments.

The cluster analysis methods may also be applied "two-dimensionally" in
order to perform two-dimensional (2D) clustering analysis on the response profiles.
Specifically, the clustering methods of the invention may be used both to cluster genes into
co-varying genesets, and to cluster response profiles into sets of similar response profiles, i.e.,
perturbations that produce similar transcriptional responses. Such dual clustering is referred
to herein as "two-dimensional clustering" or "two-dimensional cluster analysis". Distance
metrics for clustering the response profiles, similar to those described above for clustering
of genes, will be apparent to those skilled in the art. For example, one skilled in the
art will readily appreciate that a suitable correlation coefficient $r_{mn}$ for evaluating two
response profiles m and n may be provided by a formula analogous to Equation 4 above:

$$r_{mn} = \frac{\mathbf{v}^{(n)} \cdot \mathbf{v}^{(m)}}{|\mathbf{v}^{(n)}|\,|\mathbf{v}^{(m)}|} \qquad (8)$$

wherein the dot product $\mathbf{v}^{(n)} \cdot \mathbf{v}^{(m)}$ is defined in a manner analogous to Equation 5 above, by
the formula

$$\mathbf{v}^{(n)} \cdot \mathbf{v}^{(m)} = \sum_i v_i^{(n)} v_i^{(m)} \qquad (9)$$

where $v_i^{(n)}$ and $v_i^{(m)}$ are the responses of gene i to the perturbations n and m, respectively.

Generally, the clustering algorithms used in the methods of the invention
also use one or more linkage rules to group cellular constituents into one or more sets or
"clusters." For example, single linkage or the nearest neighbor method determines the
distance between the two closest objects (i.e., between the two closest genes) in different
clusters. By contrast, complete linkage methods determine the greatest distance between any two
objects (i.e., cellular constituents) in different clusters or sets. The unweighted pair-group
average evaluates the "distance" between two clusters or sets by determining the average
distance between all pairs of objects (i.e., genes) in the two clusters. Alternatively, the
weighted pair-group average evaluates the distance between two clusters or sets by
determining the weighted average distance between all pairs of objects in the two clusters,
wherein the weighting factor is proportional to the size of the respective clusters. Other
linkage rules, such as the unweighted and weighted pair-group centroid and Ward's method,
are also useful for certain embodiments of the present invention (see, e.g., Ward, 1963, J.
Am. Stat. Assn. 58:236; Hartigan, 1975, Clustering Algorithms, New York: Wiley; each of
which is incorporated herein by reference in its entirety).
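
To make the difference between the first three linkage rules concrete, the sketch below computes the distance between two clusters under single, complete, and unweighted pair-group average linkage. The function names and the use of an arbitrary pairwise distance function `dist` are choices made for this illustration.

```python
def single_linkage(cluster_a, cluster_b, dist):
    """Nearest neighbor: smallest distance between members of the two clusters."""
    return min(dist(a, b) for a in cluster_a for b in cluster_b)

def complete_linkage(cluster_a, cluster_b, dist):
    """Furthest neighbor: greatest distance between members of the two clusters."""
    return max(dist(a, b) for a in cluster_a for b in cluster_b)

def average_linkage(cluster_a, cluster_b, dist):
    """Unweighted pair-group average: mean distance over all cross-cluster pairs."""
    pairs = [dist(a, b) for a in cluster_a for b in cluster_b]
    return sum(pairs) / len(pairs)

# Hypothetical one-dimensional "objects" with absolute difference as distance.
def abs_diff(a, b):
    return abs(a - b)

cluster_a = [0.0, 1.0]
cluster_b = [3.0, 5.0]
d_single = single_linkage(cluster_a, cluster_b, abs_diff)      # pair (1.0, 3.0)
d_complete = complete_linkage(cluster_a, cluster_b, abs_diff)  # pair (0.0, 5.0)
d_average = average_linkage(cluster_a, cluster_b, abs_diff)    # mean of 3, 5, 2, 4
```

An agglomerative algorithm repeatedly merges the pair of clusters with the smallest such distance, so the choice of linkage rule determines the shape of the resulting tree.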

Once a clustering algorithm has grouped the genes from the data table into
sets or clusters (i.e., into genesets) by application of linkage rules such as those described
supra, a clustering "tree" may be generated to illustrate the genesets so determined. FIG. 14
illustrates an exemplary clustering tree generated by the hclust clustering algorithm upon
analysis of a 34 × 185 table of response profile data using the distance metric $I_{ij} = 1 - r_{ij}$. The
measured response data comprise the logarithm to the base 10 of the ratio between
abundances of each transcript in the pair of conditions (i.e., perturbation and no perturbation)
comprising each experiment n.
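
The response values described above can be illustrated with a short sketch. It assumes strictly positive abundance measurements, and the function name and example values are hypothetical.

```python
import math

def log_ratio_profile(perturbed, baseline):
    """Return log10(perturbed / baseline) for each transcript.

    perturbed, baseline : abundances of the same transcripts, in the same
    order, measured with and without the perturbation (all values > 0).
    """
    return [math.log10(p / b) for p, b in zip(perturbed, baseline)]

# Hypothetical abundances for three transcripts in one experiment:
# ten-fold induced, unchanged, and ten-fold repressed.
profile = log_ratio_profile([100.0, 10.0, 1.0], [10.0, 10.0, 10.0])
```

One such profile per experiment, stacked row by row, gives the m × k data table on which the clustering operates.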

Genesets may be readily defined based on the branchings of a clustering tree
or diagram such as the one illustrated in FIG. 14. In particular, genesets may be defined
based on the many smaller branchings of a clustering tree, or, optionally, larger genesets
may be defined corresponding to the larger branches of a clustering tree. Preferably, the
choice of branching level at which genesets are defined matches the number of distinct
response pathways expected. In embodiments wherein little or no information is available
to indicate the number of pathways, the genesets should be defined according to the
branching level wherein the branches of the clustering tree are "truly distinct."

"Truly distinct," as used herein, is defined, e.g., by a minimum distance
value between the individual branches. Typically, the distance values between truly distinct
genesets are in the range of 0.2 to 0.4, where a distance of zero corresponds to perfect
correlation and a distance of unity corresponds to no correlation. However, distances
between truly distinct genesets may be larger in certain embodiments, e.g., wherein there is
poorer quality data or fewer experiments in the response profile data. Alternatively, in other
embodiments, e.g., having better quality data or more experiments in the profile dataset, the
distance between truly distinct genesets may be less than 0.2.
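
Defining genesets by a minimum distance between branches can be sketched as a simple agglomerative procedure that stops merging once the closest pair of clusters is at least the chosen distance apart. This is an illustrative pure-Python sketch, not the hclust routine referred to above: the average linkage rule, the threshold of 0.3 (taken from the middle of the 0.2 to 0.4 range), and the gene labels are assumptions of the example.

```python
import math

def correlation_distance(vi, vj):
    """Distance 1 - r, with r the normalized dot product (no all-zero profiles)."""
    dot = sum(a * b for a, b in zip(vi, vj))
    norm = math.sqrt(sum(a * a for a in vi) * sum(b * b for b in vj))
    return 1.0 - dot / norm

def define_genesets(profiles, min_distance=0.3):
    """Group genes into genesets by average-linkage agglomerative clustering.

    profiles     : dict mapping gene name -> list of responses
    min_distance : clusters at least this far apart are treated as distinct
    """
    clusters = [[gene] for gene in profiles]

    def average_distance(ca, cb):
        total = sum(
            correlation_distance(profiles[a], profiles[b]) for a in ca for b in cb
        )
        return total / (len(ca) * len(cb))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d >= min_distance:
            break  # remaining branches are "truly distinct"
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical profiles: g1/g2 respond alike, g3/g4 respond alike.
profiles = {
    "g1": [1.0, 2.0, 0.1],
    "g2": [1.1, 2.1, 0.0],
    "g3": [-1.0, 0.2, 2.0],
    "g4": [-0.9, 0.1, 2.2],
}
genesets = define_genesets(profiles)
```

Raising `min_distance` yields fewer, larger genesets (larger branches of the tree); lowering it yields more, smaller ones.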

5.2.2. REPORTER GENES 

Once genesets have been identified, e.g., by means of the above-described
cluster analysis methods, reporter genes may be readily identified by anyone who is
reasonably skilled in the art. In particular, any gene which clusters to a geneset associated
with a particular biological effect or biological pathway is potentially useful as a reporter
gene for that biological effect or biological pathway. Genesets associated with a particular
biological effect or pathway can be readily identified, e.g., by identifying other genes in the
geneset which are associated with the particular biological effect or pathway. Further, the
members of a geneset associated with a particular biological effect or pathway will tend to
be activated (or inhibited) by perturbations (i.e., in response profiles) which target a
particular biological effect or pathway. Thus, genesets associated with a particular biological
effect or pathway can also be identified by identifying genesets that respond (i.e., whose
members are activated or inhibited) to perturbations that target the particular biological
effect or pathway.
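
The first identification strategy described above (finding the geneset that contains genes already associated with the pathway, then treating its remaining members as candidates) can be sketched as follows. The function name, the "largest overlap" selection rule, and the placeholder gene labels are assumptions of this example.

```python
def candidate_reporters(genesets, known_pathway_genes):
    """Return members of the pathway-associated geneset not already known.

    genesets            : list of lists of gene names (output of clustering)
    known_pathway_genes : genes already associated with the pathway
    The geneset sharing the most genes with the known set is taken as the
    pathway-associated geneset (an assumed selection rule).
    """
    known = set(known_pathway_genes)
    best = max(genesets, key=lambda gs: len(known & set(gs)))
    if not known & set(best):
        return []  # no geneset contains any known pathway gene
    return [gene for gene in best if gene not in known]

# Placeholder labels: P1 and P2 are known pathway genes; U1, U2, X1 are not.
genesets = [["P1", "P2", "U1"], ["X1", "U2"]]
candidates = candidate_reporters(genesets, ["P1", "P2"])
```

Candidates found this way would still be checked against the further characteristics discussed in this subsection (specificity, level of induction, sensitivity, and kinetics).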

Preferably, the reporter genes of the invention also have one or more of the
following characteristics. First, the reporter genes of the invention should be highly specific
for the biological effect or pathway of interest. In particular, the reporter genes of the
present invention should cluster specifically to genesets associated with the biological effect
or pathway of interest, and their expression should not be altered, or, less preferably, should
only be slightly altered, by perturbations which target other biological effects or pathways.

Second, the reporter genes of the invention preferably have a high level of
induction. In particular, the reporter genes of the invention are preferably expressed at high
levels, and their level of expression changes significantly in response to perturbations of the
biological effect or pathway of interest. For example, in one embodiment, expression of a
reporter gene of the invention changes at least two fold in response to a perturbation to the
biological effect or pathway of interest. In a more preferred embodiment, expression of a
reporter gene of the invention changes by at least ten fold in response to a perturbation to
the biological effect or pathway of interest. Most preferably, a reporter gene of the
invention will change by a factor of one hundred or more in response to a perturbation to
the biological effect or pathway of interest.
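
The induction thresholds above (two fold, ten fold, one hundred fold) can be applied mechanically, as in the sketch below. The function names and tier labels are hypothetical, and the sketch assumes that increases and decreases in expression are treated symmetrically, which the text does not state explicitly.

```python
def fold_change(baseline, perturbed):
    """Magnitude of change as a fold value >= 1 (up or down treated alike)."""
    ratio = perturbed / baseline
    return max(ratio, 1.0 / ratio)

def induction_tier(baseline, perturbed):
    """Classify a candidate reporter by the fold thresholds given in the text."""
    fold = fold_change(baseline, perturbed)
    if fold >= 100.0:
        return "at least 100-fold"
    if fold >= 10.0:
        return "at least 10-fold"
    if fold >= 2.0:
        return "at least 2-fold"
    return "below 2-fold"

# Hypothetical expression levels (arbitrary units, all > 0).
tier_up = induction_tier(1.0, 150.0)   # strongly induced
tier_down = induction_tier(10.0, 0.5)  # twenty-fold repressed
```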

The reporter genes of the invention are also preferably sensitive to perturbations to
the biological effect or pathway of interest. In particular, preferably the reporter genes of
the invention are perturbed (i.e., their expression is up-regulated or down-regulated) at
measurable levels in response to only slight perturbations to the biological effect or pathway
of interest, such as in response to low doses of a drug which targets the biological effect or
pathway of interest. More preferably, the reporter genes of the invention are more sensitive
to perturbations to the biological effect or pathway of interest than are other genes in the
geneset for that biological effect or pathway.

In most embodiments, the reporter genes of the invention are preferably general
reporters for the entire biological effect or pathway of interest. More specifically, the
reporter genes preferably cluster, and therefore respond, to perturbations targeted to the
entire biological effect or pathway of interest and not just to particular portions thereof (e.g.,
to early or late steps of a particular biological pathway). However, one of skill in the art can
readily appreciate that in certain embodiments it will be useful to identify reporter genes for
a particular part of a biological effect or pathway of interest. Accordingly, in such
embodiments, the reporter genes identified are preferably specific for those particular
portions of the biological effect or pathway that are of interest.

Finally, in certain embodiments, the reporter genes of the invention are genes which
are induced with fast kinetics, and which therefore respond quickly to perturbations of the biological
effect or pathway of interest. For example, in most embodiments, changes in the reporter
genes of the invention will preferably reach steady state within about eight hours after a
perturbation (e.g., after exposure to a drug which targets a biological effect or pathway of
interest). More preferably, a reporter gene of the invention induces within about six hours
after a perturbation. In other preferred embodiments, a reporter gene of the invention
induces within about 2 hours, within about ninety minutes, within about sixty minutes,
within about thirty minutes, within about ten minutes, or within about seven minutes after a
perturbation.

Other embodiments of the invention provide methods for using
combinations of genes to construct a more specific reporter for a particular biological
pathway, where it is desired to increase the specificity of a particular pathway reporter
system. In this embodiment, more than one gene or cellular constituent in the same
biological pathway is used as a reporter for that pathway. By way of example, a reporter
gene of the Invasive Growth pathway such as PGU1, and a second gene in the same
pathway such as SVS1, may be detected simultaneously as a reporter for the Invasive
Growth pathway. Such co-detection can serve to increase the sensitivity of a reporter of a
particular biological pathway. Alternatively, for example, the promoter from a first gene of
the Invasive Growth pathway, such as PGU1, may be fused to a marker such as GFP (green
fluorescent protein), and the promoter from a second gene in the same pathway, such as
SVS1, could be fused to BFP (blue fluorescent protein). Detection of both protein
markers simultaneously can thus provide a higher sensitivity. Thus in this embodiment, the
reporter of the pathway is a combination of two or more genes. In other embodiments of the
invention, 2-3, 3-5, or 5-10 genes are detected simultaneously as a reporter system for a
particular biological pathway.
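
One possible reading of the combination-reporter idea is sketched below. The decision rule (the pathway is called responsive only when every reporter in the combination changes by at least a chosen fold value) is an assumption of this example, since the text does not specify how the signals are combined; the fold values shown are invented for illustration.

```python
def pathway_called_active(fold_changes, reporters, min_fold=2.0):
    """Return True only if every reporter in the combination meets min_fold.

    fold_changes : dict mapping reporter name -> measured fold change (>= 1)
    reporters    : names of the genes used together as the reporter system
    Requiring all reporters to respond (assumed rule) makes the combined
    reporter more specific than any single gene.
    """
    return all(fold_changes[gene] >= min_fold for gene in reporters)

# Hypothetical measurements for a two-gene Invasive Growth reporter system.
both_respond = pathway_called_active({"PGU1": 3.0, "SVS1": 2.5}, ["PGU1", "SVS1"])
one_responds = pathway_called_active({"PGU1": 3.0, "SVS1": 1.1}, ["PGU1", "SVS1"])
```

Replacing `all` with `any` would instead favor sensitivity, at the cost of specificity.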
10 The invention provides a method of identifying a reporter gene for a 

particular biological pathway in a cell comprising identifying a gene which clusters to a 
geneset associated with the biological pathway, wherein said gene which clusters to the 
geneset associated with the particular biological pathway is a reporter gene. 

In one embodiment the reporter gene is a reporter for the
ergosterol-pathway, and the reporter gene is selected from the group consisting of:
YHR039C (as depicted in FIG.2, as set forth in SEQ ID NO:1), YLW100W (as depicted in
FIG.4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG.6, as set forth in SEQ ID
NO:5), YGR131W (as depicted in FIG.8, as set forth in SEQ ID NO:7), and YDR453C (as
depicted in FIG.10, as set forth in SEQ ID NO:9).

In another embodiment the reporter gene is a reporter for the PKC-pathway,
and the reporter gene is selected from the group consisting of: SLT2(YHR030C) (as
depicted in FIG.17A-B, as set forth in SEQ ID NO:11), YKR161C (as depicted in
FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in FIG.21A-B,
as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG.23A-B, as set forth
in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set forth in SEQ ID NO:19),
and ST1(YDR055W) (as depicted in FIG.27A-B, as set forth in SEQ ID NO:21).

In another embodiment the reporter gene is a reporter for the Invasive
Growth pathway, and the reporter gene is selected from the group consisting of:
KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG.35, as set forth in SEQ ID NO:29).

5.2.3. TARGET GENES 
Once genesets have been identified, e.g., by means of the above-described
cluster analysis methods, target genes may be readily identified in the following manner.
Any gene which clusters to a geneset associated with a particular biological effect or
biological pathway may be considered a potential target gene and may further be tested to
examine whether the expression and/or activity of the gene is necessary for normal activity
or function of the pathway. A gene whose expression and/or activity is necessary for
normal activity or function of the pathway is therefore useful as a target for drugs designed
to enhance, inhibit, or modulate the particular biological pathway. Any method known in
the art may be used to examine the necessity of a particular gene to the activity or function
of an associated biological pathway. By way of illustration, a potential target gene,
such as a potential ergosterol-pathway target gene, may be validated as a target gene in
the following manner.

Once a potential target gene has been identified (e.g., by clustering analysis
as described herein), the gene may be examined by mutational analysis to determine
whether the gene is essential. Methods for mutational analysis are commonly known in the
art. If the potential ergosterol-pathway target gene is essential for normal growth of the
yeast, such a gene is a target gene. Such a gene would constitute a preferred target for
antifungal or fungicidal drug development. Further, additional genetic analysis may be
performed in order to construct and characterize a conditional allele of the gene in order to
determine the effects of gene product inhibition, particularly whether the cell dies upon
shifting to the restrictive condition, or whether the cell can recover upon shifting back to the
permissive condition. Any method known in the art may be used to construct a conditional
allele, for example, a temperature sensitive allele, or promoter replacement may be
performed so that expression may be regulated. The construction of a conditional allele
also allows for the determination of the terminal phenotype, contributing to an
understanding of the function of the gene. If, for example, the potential ergosterol-pathway
gene is determined not to be essential in S. cerevisiae, or if a severe growth defect does not
result from deletion of the gene, the gene is not a preferred target gene for the development
of a pathway-specific drug such as an antifungal agent.

Another way in which a potential target gene may be validated is by
searching sequence databases for homologous genes. For example, in the case of an S.
cerevisiae target gene, a database from the yeast Candida may serve as a database against
which to compare the sequence. Alternatively, a search of all sequence databases may be
performed to uncover sequence motifs that will reveal potential activities of the gene.
Specifically, by way of example, computer programs for determining homology include but
are not limited to TBLASTN, BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and
Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-8; Altschul et al., 1990, J. Mol. Biol.
215(3):403-10; Thompson et al., 1994, Nucleic Acids Res. 22(22):4673-80; Higgins et al.,
1996, Methods Enzymol. 266:383-402; Altschul et al., 1990, J. Mol. Biol. 215(3):403-10).
If, for example, a homolog of the S. cerevisiae target gene is found in Candida, the Candida
gene may be analyzed as above to determine whether the homolog is essential in Candida
and, if so, would constitute a validated target.

The invention provides a method of identifying a target gene for a particular
biological pathway in a cell comprising identifying a gene which clusters to a geneset
associated with the particular biological pathway, wherein said gene which clusters to a
geneset associated with the particular biological pathway is identified as a gene which
is necessary for normal function of said particular biological pathway.
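
The selection logic of this section can be sketched as follows, purely for illustration.
The gene names and essentiality results are hypothetical placeholders; in practice the
geneset would come from the cluster analysis and the essentiality calls from mutational
analysis as described above.

```python
def classify_candidates(geneset, essentiality):
    """Sort the genes of a pathway geneset by their essentiality result.

    geneset      -- genes that cluster with the pathway of interest
    essentiality -- dict mapping gene -> True (essential) or False (not
                    essential); genes missing from the dict are untested
    Returns (targets, rejected, untested) as three lists.
    """
    targets, rejected, untested = [], [], []
    for gene in geneset:
        if gene not in essentiality:
            untested.append(gene)      # still only a potential target
        elif essentiality[gene]:
            targets.append(gene)       # necessary for normal function
        else:
            rejected.append(gene)      # not a preferred target
    return targets, rejected, untested


# Hypothetical geneset and mutational-analysis results.
geneset = ["GENE_A", "GENE_B", "GENE_C"]
essentiality = {"GENE_A": True, "GENE_B": False}
targets, rejected, untested = classify_candidates(geneset, essentiality)
print(targets, rejected, untested)
```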

5.3. PERTURBATION METHODS

Methods for perturbation of biological pathways at various levels of a cell
are increasingly widely known and applied in the art. Any such methods that are capable of
specifically targeting and controllably modifying (e.g., either by a graded increase or
activation or by a graded decrease or inhibition) specific cellular constituents (e.g., gene
expression, RNA concentrations, protein abundances, protein activities, or so forth) can be
employed in performing pathway perturbations. Controllable modifications of cellular
constituents consequentially controllably perturb pathways originating at the modified
cellular constituents. Such pathways originating at specific cellular constituents are
preferably employed to represent drug action in this invention. Preferable modification
methods are capable of individually targeting each of a plurality of cellular constituents and
most preferably a substantial fraction of such cellular constituents.

The following methods are exemplary of those that can be used to modify
cellular constituents and thereby to produce pathway perturbations which generate the
pathway responses used in the steps of the methods of this invention as previously
described. This invention is adaptable to other methods for making controllable
perturbations to pathways, and especially to cellular constituents from which pathways
originate.

Pathway perturbations are preferably made in cells of cell types derived from
any organism for which genomic or expressed sequence information is available and for
which methods are available that permit controllable modification of the expression of
specific genes. Genome sequencing is currently underway for several eukaryotic
organisms, including humans, nematodes, Arabidopsis, and flies. In a preferred
embodiment, the invention is carried out using a yeast, with Saccharomyces cerevisiae most
preferred because the sequence of the entire genome of a S. cerevisiae strain has been
determined. In addition, well-established methods are available for controllably modifying
expression of yeast genes. A preferred strain of yeast is a S. cerevisiae strain for which yeast
genomic sequence is known, such as strain S288C or substantially isogenic derivatives of
it (see, e.g., Dujon et al., 1994, Nature 369:371-378; Bussey et al., 1995, Proc. Natl. Acad.
Sci. U.S.A. 92:3809-3813; Feldmann et al., 1994, E.M.B.O. J. 13:5795-5809; Johnston et
al., 1994, Science 265:2077-2082; Galibert et al., 1996, E.M.B.O. J. 15:2031-2049).
However, other strains may be used as well. Yeast strains are available, e.g., from
American Type Culture Collection, 10801 University Boulevard, Manassas, Virginia
20110-2209. Standard techniques for manipulating yeast are described in C. Kaiser, S.
Michaelis, & A. Mitchell, 1994, Methods in Yeast Genetics: A Cold Spring Harbor
Laboratory Course Manual, Cold Spring Harbor Laboratory Press, New York; and
Sherman et al., 1986, Methods in Yeast Genetics: A Laboratory Manual, Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York.

The exemplary methods described in the following include use of titratable
expression systems, use of transfection or viral transduction systems, direct modifications to
RNA abundances or activities, direct modifications of protein abundances, and direct
modification of protein activities including use of drugs (or chemical moieties in general)
with specific known action.

5.3.1. TITRATABLE EXPRESSION SYSTEMS 
Any of the several known titratable, or equivalently controllable, expression
systems available for use in the budding yeast Saccharomyces cerevisiae are adaptable to
this invention (Mumberg et al., 1994, Nucl. Acids Res. 22:5767-5768). Usually, gene
expression is controlled by transcriptional controls, with the promoter of the gene to be
controlled replaced on its chromosome by a controllable, exogenous promoter. The most
commonly used controllable promoter in yeast is the GAL1 promoter (Johnston et al., 1984,
Mol. Cell. Biol. 4:1440-1448). The GAL1 promoter is strongly repressed by the presence of
glucose in the growth medium, and is gradually switched on in a graded manner to high
levels of expression by the decreasing abundance of glucose and the presence of galactose.
The GAL1 promoter usually allows a 5-100 fold range of expression control on a gene of
interest.

Other frequently used promoter systems include the MET25 promoter
(Kerjan et al., 1986, Nucl. Acids Res. 14:7861-7871), which is induced by the absence of
methionine in the growth medium, and the CUP1 promoter, which is induced by copper
(Mascorro-Gallardo et al., 1996, Gene 172:169-170). All of these promoter systems are
controllable in that gene expression can be incrementally controlled by incremental changes
in the abundances of a controlling moiety in the growth medium.

One disadvantage of the above listed expression systems is that control of
promoter activity (effected by, e.g., changes in carbon source, removal of certain amino
acids), often causes other changes in cellular physiology which independently alter the
expression levels of other genes. A recently developed system for yeast, the Tet system,
alleviates this problem to a large extent (Gari et al., 1997, Yeast 13:837-848). The Tet
promoter, adopted from mammalian expression systems (Gossen et al., 1995, Proc. Natl.
Acad. Sci. USA 89:5547-5551), is modulated by the concentration of the antibiotic
tetracycline or the structurally related compound doxycycline. Thus, in the absence of
doxycycline, the promoter induces a high level of expression, and the addition of increasing
levels of doxycycline causes increased repression of promoter activity. Intermediate levels
of gene expression can be achieved in the steady state by addition of intermediate levels of
drug. Furthermore, levels of doxycycline that give maximal repression of promoter activity
(10 micrograms/ml) have no significant effect on the growth rate of wild-type yeast cells
(Gari et al., 1997, Yeast 13:837-848).

In mammalian cells, several means of titrating expression of genes are
available (Spencer, 1996, Trends Genet. 12:181-187). As mentioned above, the Tet system
is widely used, both in its original form, the "forward" system, in which addition of
doxycycline represses transcription, and in the newer "reverse" system, in which
doxycycline addition stimulates transcription (Gossen et al., 1995, Proc. Natl. Acad. Sci.
USA 89:5547-5551; Hoffmann et al., 1997, Nucl. Acids Res. 25:1078-1079; Hofmann et al.,
1996, Proc. Natl. Acad. Sci. USA 93:5185-5190; Paulus et al., 1996, Journal of Virology
70:62-67). Another commonly used controllable promoter system in mammalian cells is
the ecdysone-inducible system developed by Evans and colleagues (No et al., 1996, Proc.
Natl. Acad. Sci. USA 93:3346-3351), where expression is controlled by the level of
muristerone added to the cultured cells. Finally, expression can be modulated using the
"chemical-induced dimerization" (CID) system developed by Schreiber, Crabtree, and
colleagues (Belshaw et al., 1996, Proc. Natl. Acad. Sci. USA 93:4604-4607; Spencer, 1996,
Trends Genet. 12:181-187) and similar systems in yeast. In this system, the gene of interest
is put under the control of the CID-responsive promoter, and transfected into cells
expressing two different hybrid proteins, one comprised of a DNA-binding domain fused to
FKBP12, which binds FK506. The other hybrid protein contains a transcriptional activation
domain also fused to FKBP12. The CID inducing molecule is FK1012, a homodimeric
version of FK506 that is able to bind simultaneously both the DNA binding and
transcriptional activating hybrid proteins. In the graded presence of FK1012, graded
transcription of the controlled gene is activated.

For each of the mammalian expression systems described above, as is widely
known to those of skill in the art, the gene of interest is put under the control of the
controllable promoter, and a plasmid harboring this construct along with an antibiotic
resistance gene is transfected into cultured mammalian cells. In general, the plasmid DNA
integrates into the genome, and drug resistant colonies are selected and screened for
appropriate expression of the regulated gene. Alternatively, the regulated gene can be
inserted into an episomal plasmid such as pCEP4 (Invitrogen, Inc.), which contains
components of the Epstein-Barr virus necessary for plasmid replication.

In a preferred embodiment, titratable expression systems, such as the ones
described above, are introduced for use into cells or organisms lacking the corresponding
endogenous gene and/or gene activity, e.g., organisms in which the endogenous gene has
been disrupted or deleted. Methods for producing such "knock outs" are well known to
those of skill in the art, see e.g., Pettitt et al., 1996, Development 122:4149-4157; Spradling
et al., 1995, Proc. Natl. Acad. Sci. USA, 92:10824-10830; Ramirez-Solis et al., 1993,
Methods Enzymol. 225:855-878; and Thomas et al., 1987, Cell 51:503-512.

5.3.2. TRANSFECTION SYSTEMS FOR MAMMALIAN CELLS 
Transfection or viral transduction of target genes can introduce controllable
perturbations in biological pathways in mammalian cells. Preferably, transfection or
transduction of a target gene can be used with cells that do not naturally express the target
gene of interest. Such non-expressing cells can be derived from a tissue not normally
expressing the target gene or the target gene can be specifically mutated in the cell. The
target gene of interest can be cloned into one of many mammalian expression plasmids, for
example, the pcDNA3.1 +/- system (Invitrogen, Inc.) or retroviral vectors, and introduced
into the non-expressing host cells. Transfected or transduced cells expressing the target
gene may be isolated by selection for a drug resistance marker encoded by the expression
vector. The level of gene transcription is monotonically related to the transfection dosage.
In this way, the effects of varying levels of the target gene may be investigated.

A particular example of the use of this method is the search for drugs that
target the src-family protein tyrosine kinase, lck, a key component of the T cell receptor
activation pathway (Anderson et al., 1994, Adv. Immunol. 56:171-178). Inhibitors of this
enzyme are of interest as potential immunosuppressive drugs (Hanke JH, 1996, J. Biol.
Chem. 271(2):695-701). A specific mutant of the Jurkat T cell line (JCaM1) is available that
does not express lck kinase (Straus et al., 1992, Cell 70:585-593). Therefore, introduction
of the lck gene into JCaM1 by transfection or transduction permits specific perturbation of
pathways of T cell activation regulated by the lck kinase. The efficiency of transfection or
transduction, and thus the level of perturbation, is dose related. The method is generally
useful for providing perturbations of gene expression or protein abundances in cells not
normally expressing the genes to be perturbed.

5.3.3. METHODS OF MODIFYING RNA ABUNDANCES OR ACTIVITIES

Methods of modifying RNA abundances and activities currently fall within
three classes: ribozymes, antisense species, and RNA aptamers (Good et al., 1997, Gene
Therapy 4: 45-54). Controllable application or exposure of a cell to these entities permits
controllable perturbation of RNA abundances.

Ribozymes are RNAs which are capable of catalyzing RNA cleavage
reactions (Cech, 1987, Science 236:1532-1539; PCT International Publication
WO 90/11364, published October 4, 1990; Sarver et al., 1990, Science 247: 1222-1225).
"Hairpin" and "hammerhead" RNA ribozymes can be designed to specifically cleave a
particular target mRNA. Rules have been established for the design of short RNA
molecules with ribozyme activity, which are capable of cleaving other RNA molecules in a
highly sequence specific way and can be targeted to virtually all kinds of RNA (Haseloff
et al., 1988, Nature 334:585-591; Koizumi et al., 1988, FEBS Lett. 228:228-230; Koizumi
et al., 1988, FEBS Lett. 239:285-288). Ribozyme methods involve exposing a cell to such
small RNA ribozyme molecules, inducing their expression in a cell, etc. (Grassi and
Marini, 1996, Annals of Medicine 28: 499-510; Gibson, 1996, Cancer and Metastasis
Reviews 15: 287-299).
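
As a hedged illustration of how such design rules might be applied, the sketch below
scans an mRNA for candidate hammerhead cleavage sites. It assumes the commonly cited
preference of hammerhead ribozymes for cleavage 3' of a GUC triplet; the specification
does not state which rules are intended, and the function name, arm length, and example
sequence are hypothetical.

```python
def hammerhead_sites(mrna, arm=8):
    """List candidate hammerhead cleavage sites in an mRNA (5'->3').

    Assumes cleavage immediately 3' of a GUC triplet. Only sites with at
    least `arm` nucleotides on each side are reported, so that two
    target-binding arms of that length could be designed.
    Returns tuples (cleavage_index, upstream_segment, downstream_segment).
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input
    sites = []
    for i in range(arm, len(mrna) - arm - 2):
        if mrna[i:i + 3] == "GUC":
            cut = i + 3  # cleavage occurs after the C of GUC
            sites.append((cut, mrna[i - arm:cut], mrna[cut:cut + arm]))
    return sites


# Hypothetical short sequence with a single GUC triplet.
sites = hammerhead_sites("AAAAGUCAAAA", arm=4)
print(sites)
```

The upstream and downstream segments returned for each site are the regions to which
the ribozyme's flanking arms would be made complementary.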

Ribozymes can be routinely expressed in vivo in sufficient number to be
catalytically effective in cleaving mRNA, and thereby modifying mRNA abundances in a
cell (Cotten et al., 1989, EMBO J. 8:3861-3866). In particular, a ribozyme coding DNA
sequence, designed according to the previous rules and synthesized, for example, by
standard phosphoramidite chemistry, can be ligated into a restriction enzyme site in the
anticodon stem and loop of a gene encoding a tRNA, which can then be transformed into
and expressed in a cell of interest by methods routine in the art. Preferably, an inducible
promoter (e.g., a glucocorticoid or a tetracycline response element) is also introduced into
this construct so that ribozyme expression can be selectively controlled. tDNA genes (i.e.,
genes encoding tRNAs) are useful in this application because of their small size, high rate
of transcription, and ubiquitous expression in different kinds of tissues. Therefore,
ribozymes can be routinely designed to cleave virtually any mRNA sequence, and a cell can
be routinely transformed with DNA coding for such ribozyme sequences such that a
controllable and catalytically effective amount of the ribozyme is expressed. Accordingly
the abundance of virtually any RNA species in a cell can be perturbed.

In another embodiment, activity of a target RNA (preferably mRNA) species,
specifically its rate of translation, can be controllably inhibited by the controllable
application of antisense nucleic acids. An "antisense" nucleic acid as used herein refers to a
nucleic acid capable of hybridizing to a sequence-specific (e.g., non-poly A) portion of the
target RNA, for example its translation initiation region, by virtue of some sequence
complementarity to a coding and/or non-coding region. The antisense nucleic acids of the
invention can be oligonucleotides that are double-stranded or single-stranded, RNA or DNA
or a modification or derivative thereof, which can be directly administered in a controllable
manner to a cell or which can be produced intracellularly by transcription of exogenous,
introduced sequences in controllable quantities sufficient to perturb translation of the target
RNA.

Preferably, antisense nucleic acids are of at least six nucleotides and are
preferably oligonucleotides (ranging from 6 to about 200 nucleotides). In specific
aspects, the oligonucleotide is at least 10 nucleotides, at least 15 nucleotides, at least 100
nucleotides, or at least 200 nucleotides. The oligonucleotides can be DNA or RNA or
chimeric mixtures or derivatives or modified versions thereof, single-stranded or
double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or
phosphate backbone. The oligonucleotide may include other appending groups such as
peptides, or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al.,
1989, Proc. Natl. Acad. Sci. U.S.A. 86: 6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad.
Sci. U.S.A. 84: 648-652; PCT Publication No. WO 88/09810, published December 15,
1988), hybridization-triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques
6: 958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549).

In a preferred aspect of the invention, an antisense oligonucleotide is
provided, preferably as single-stranded DNA. The oligonucleotide may be modified at any
position on its structure with constituents generally known in the art.

The antisense oligonucleotides may comprise at least one modified base
moiety which is selected from the group including but not limited to 5-fluorouracil,
5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine.

In another embodiment, the oligonucleotide comprises at least one modified
sugar moiety selected from the group including, but not limited to, arabinose,
2-fluoroarabinose, xylulose, and hexose.

In yet another embodiment, the oligonucleotide comprises at least one
modified phosphate backbone selected from the group consisting of a phosphorothioate, a
phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a
methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.

In yet another embodiment, the oligonucleotide is an α-anomeric
oligonucleotide. An α-anomeric oligonucleotide forms specific double-stranded hybrids
with complementary RNA in which, contrary to the usual β-units, the strands run parallel to
each other (Gautier et al., 1987, Nucl. Acids Res. 15: 6625-6641).

The oligonucleotide may be conjugated to another molecule, e.g., a peptide,
hybridization triggered cross-linking agent, transport agent, hybridization-triggered
cleavage agent, etc.

The antisense nucleic acids of the invention comprise a sequence
complementary to at least a portion of a target RNA species. However, absolute
complementarity, although preferred, is not required. A sequence "complementary to at
least a portion of an RNA," as referred to herein, means a sequence having sufficient
complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case
of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be
tested, or triplex formation may be assayed. The ability to hybridize will depend on both
the degree of complementarity and the length of the antisense nucleic acid. Generally, the
longer the hybridizing nucleic acid, the more base mismatches with a target RNA it may
contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art
can ascertain a tolerable degree of mismatch by use of standard procedures to determine the
melting point of the hybridized complex. The amount of antisense nucleic acid that will be
effective in inhibiting translation of the target RNA can be determined by standard assay
techniques.
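
For illustration only, the notion of complementarity and of counting base mismatches
may be sketched as follows. The target sequence is a hypothetical placeholder, and the
sketch considers only Watson-Crick pairing of an unmodified single-stranded DNA oligo
of the same length as the target segment; it does not predict duplex stability, which
would be determined experimentally as noted above.

```python
# Watson-Crick partner (as a DNA base) for each RNA base of the target.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna):
    """Return the DNA oligo (5'->3') fully complementary to an RNA segment (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(target_rna.upper()))


def count_mismatches(oligo, target_rna):
    """Count positions at which `oligo` differs from the perfect antisense oligo."""
    perfect = antisense_dna(target_rna)
    if len(oligo) != len(perfect):
        raise ValueError("oligo and target segment must have equal length")
    return sum(1 for a, b in zip(oligo.upper(), perfect) if a != b)


# Hypothetical target segment around a translation initiation codon.
target = "AUGGCUAGC"
perfect = antisense_dna(target)
print(perfect, count_mismatches("GCTAGACAT", target))
```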

Oligonucleotides of the invention may be synthesized by standard methods
known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially
available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate
oligonucleotides may be synthesized by the method of Stein et al. (1988, Nucl. Acids Res.
16: 3209), methylphosphonate oligonucleotides can be prepared by use of controlled pore
glass polymer supports (Sarin et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85: 7448-7451),
etc. In another embodiment, the oligonucleotide is a 2'-O-methylribonucleotide (Inoue et
al., 1987, Nucl. Acids Res. 15: 6131-6148), or a chimeric RNA-DNA analog (Inoue et al.,
1987, FEBS Lett. 215: 327-330).

The synthesized antisense oligonucleotides can then be administered to a cell
in a controlled manner. For example, the antisense oligonucleotides can be placed in the
growth environment of the cell at controlled levels where they may be taken up by the cell.
The uptake of the antisense oligonucleotides can be assisted by use of methods well known
in the art.

In an alternative embodiment, the antisense nucleic acids of the invention are
controllably expressed intracellularly by transcription from an exogenous sequence. For
example, a vector can be introduced in vivo such that it is taken up by a cell, within which
cell the vector or a portion thereof is transcribed, producing an antisense nucleic acid (RNA)
of the invention. Such a vector would contain a sequence encoding the antisense nucleic
acid. Such a vector can remain episomal or become chromosomally integrated, as long as it
can be transcribed to produce the desired antisense RNA. Such vectors can be constructed
by recombinant DNA technology methods standard in the art. Vectors can be plasmid,
viral, or others known in the art, used for replication and expression in mammalian cells.
Expression of the sequences encoding the antisense RNAs can be by any promoter known
in the art to act in a cell of interest. Such promoters can be inducible or constitutive. Most
preferably, promoters are controllable or inducible by the administration of an exogenous
moiety in order to achieve controlled expression of the antisense oligonucleotide. Such
controllable promoters include the Tet promoter. Less preferably usable promoters for
mammalian cells include, but are not limited to: the SV40 early promoter region (Bernoist
and Chambon, 1981, Nature 290: 304-310), the promoter contained in the 3' long terminal
repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell 22: 787-797), the herpes
thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:
1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982,
Nature 296: 39-42), etc.

Therefore, antisense nucleic acids can be routinely designed to target
virtually any mRNA sequence, and a cell can be routinely transformed with or exposed to
nucleic acids coding for such antisense sequences such that an effective and controllable
amount of the antisense nucleic acid is expressed. Accordingly the translation of virtually
any RNA species in a cell can be controllably perturbed.

Finally, in a further embodiment, RNA aptamers can be introduced into or
expressed in a cell. RNA aptamers are specific RNA ligands for proteins, such as for Tat
and Rev RNA (Good et al., 1997, Gene Therapy 4: 45-54) that can specifically inhibit their
translation.

In specific embodiments of the invention, methods of modifying RNA
abundances and activities are used to modify an RNA corresponding to a target gene or
reporter gene of the invention. In other specific embodiments of the invention, a ribozyme,
antisense species, or RNA aptamer directed to a target gene of the invention is used as a
drug or therapeutic agent.

5.3.4. METHODS OF MODIFYING PROTEIN ABUNDANCES 
Methods of modifying protein abundances include, inter alia, those altering
protein degradation rates and those using antibodies (which bind to proteins affecting
abundances or activities of native target protein species). Increasing (or decreasing) the
degradation rates of a protein species decreases (or increases) the abundance of that species.
Methods for controllably increasing the degradation rate of a target protein in response to
elevated temperature and/or exposure to a particular drug, which are known in the art, can
be employed in this invention. For example, one such method employs a heat-inducible or
drug-inducible N-terminal degron, which is an N-terminal protein fragment that exposes a
degradation signal promoting rapid protein degradation at a higher temperature (e.g., 37° C)
and which is hidden to prevent rapid degradation at a lower temperature (e.g., 23° C)
(Dohmen et al., 1994, Science 263:1273-1276). Such an exemplary degron is Arg-DHFRts,
a variant of murine dihydrofolate reductase in which the N-terminal Val is replaced by Arg
and the Pro at position 66 is replaced with Leu. According to this method, for example, a
gene for a target protein, P, is replaced by standard gene targeting methods known in the art
(Lodish et al., 1995, Molecular Biology of the Cell, Chpt. 8, New York: W.H. Freeman and
Co.) with a gene coding for the fusion protein Ub-Arg-DHFRts-P ("Ub" stands for
ubiquitin). The N-terminal ubiquitin is rapidly cleaved after translation, exposing the
N-terminal degron. At lower temperatures, lysines internal to Arg-DHFRts are not exposed,
ubiquitination of the fusion protein does not occur, degradation is slow, and active target
protein levels are high. At higher temperatures (in the absence of methotrexate), lysines
internal to Arg-DHFRts are exposed, ubiquitination of the fusion protein occurs, degradation
is rapid, and active target protein levels are low. Heat activation of degradation is
controllably blocked by exposure to methotrexate. This method is adaptable to other
N-terminal degrons which are responsive to other inducing factors, such as drugs and
temperature changes.

Target protein abundances and also, directly or indirectly, their activities can
also be decreased by (neutralizing) antibodies. By providing for controlled exposure to
such antibodies, protein abundances/activities can be controllably modified. For example,
antibodies to suitable epitopes on protein surfaces may decrease the abundance, and thereby
indirectly decrease the activity, of the wild-type active form of a target protein by
aggregating active forms into complexes with less or minimal activity as compared to the
unaggregated wild-type form. Alternately, antibodies may directly decrease
protein activity by, e.g., interacting directly with active sites or by blocking access of
substrates to active sites. Conversely, in certain cases, (activating) antibodies may also
interact with proteins and their active sites to increase resulting activity. In either case,
antibodies (of the various types to be described) can be raised against specific protein
species (by the methods to be described) and their effects screened. The effects of the
antibodies can be assayed and suitable antibodies selected that raise or lower the target
protein species concentration and/or activity. Such assays involve introducing antibodies
into a cell (see below), and assaying the amount or activity of the wild-type form
of the target protein by standard means (such as immunoassays) known in the art. The net
activity of the wild-type form can be assayed by assay means appropriate to the known
activity of the target protein.

Antibodies can be introduced into cells in numerous fashions, including, for
example, microinjection of antibodies into a cell (Morgan et al., 1988, Immunology Today
9:84-86) or transforming hybridoma mRNA encoding a desired antibody into a cell (Burke
et al., 1984, Cell 36:847-858). In a further technique, recombinant antibodies can be
engineered and ectopically expressed in a wide variety of non-lymphoid cell types to bind
to target proteins as well as to block target protein activities (Biocca et al., 1995, Trends in
Cell Biology 5:248-252). Preferably, expression of the antibody is under control of a
controllable promoter, such as the Tet promoter. A first step is the selection of a particular
monoclonal antibody with appropriate specificity to the target protein (see below). Then
sequences encoding the variable regions of the selected antibody can be cloned into various
engineered antibody formats, including, for example, whole antibody, Fab fragments, Fv
fragments, single chain Fv fragments (VH and VL regions united by a peptide linker)
("ScFv" fragments), diabodies (two associated ScFv fragments with different specificities),
and so forth (Hayden et al., 1997, Current Opinion in Immunology 9:210-212).
Intracellularly expressed antibodies of the various formats can be targeted into cellular
compartments (e.g., the cytoplasm, the nucleus, the mitochondria, etc.) by expressing them
as fusions with the various known intracellular leader sequences (Bradbury et al., 1995,
Antibody Engineering, vol. 2, Borrebaeck ed., IRL Press, pp 295-361). In particular, the
ScFv format appears to be particularly suitable for cytoplasmic targeting.

Antibody types include, but are not limited to, polyclonal, monoclonal,
chimeric, single chain, Fab fragments, and an Fab expression library. Various procedures
known in the art may be used for the production of polyclonal antibodies to a target protein.
For production of the antibody, various host animals can be immunized by injection with
the target protein; such host animals include, but are not limited to, rabbits, mice, rats, etc.
Various adjuvants can be used to increase the immunological response, depending on the
host species, and include, but are not limited to, Freund's (complete and incomplete),
mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin,
pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, and potentially useful
human adjuvants such as bacillus Calmette-Guerin (BCG) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed towards a target protein,
any technique that provides for the production of antibody molecules by continuous cell
lines in culture may be used. Such techniques include, but are not restricted to, the
hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497),
the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983,
Immunology Today 4:72), and the EBV hybridoma technique to produce human
monoclonal antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc., pp. 77-96). In an additional embodiment of the invention, monoclonal
antibodies can be produced in germ-free animals utilizing recent technology
(PCT/US90/02545). According to the invention, human antibodies may be used and can be
obtained by using human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci. U.S.A.
80:2026-2030), or by transforming human B cells with EBV virus in vitro (Cole et al., 1985, in
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In fact,
according to the invention, techniques developed for the production of "chimeric
antibodies" (Morrison et al., 1984, Proc. Natl. Acad. Sci. U.S.A. 81:6851-6855; Neuberger
et al., 1984, Nature 312:604-608; Takeda et al., 1985, Nature 314:452-454) by splicing the
genes from a mouse antibody molecule specific for the target protein together with genes
from a human antibody molecule of appropriate biological activity can be used; such
antibodies are within the scope of this invention.

Additionally, where monoclonal antibodies are advantageous, they can be
alternatively selected from large antibody libraries using the techniques of phage display
(Marks et al., 1992, J. Biol. Chem. 267:16007-16010). Using this technique, libraries of up
to 10^12 different antibodies have been expressed on the surface of fd filamentous phage,
creating a "single pot" in vitro immune system of antibodies available for the selection of
monoclonal antibodies (Griffiths et al., 1994, EMBO J. 13:3245-3260). Selection of
antibodies from such libraries can be done by techniques known in the art, including
contacting the phage to immobilized target protein, selecting and cloning phage bound to
the target, and subcloning the sequences encoding the antibody variable regions into an
appropriate vector expressing a desired antibody format.

According to the invention, techniques described for the production of single
chain antibodies (U.S. patent 4,946,778) can be adapted to produce single chain antibodies
specific to the target protein. An additional embodiment of the invention utilizes the
techniques described for the construction of Fab expression libraries (Huse et al., 1989,
Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab
fragments with the desired specificity for the target protein.

Antibody fragments that contain the idiotypes of the target protein can be
generated by techniques known in the art. For example, such fragments include, but are not
limited to: the F(ab')2 fragment, which can be produced by pepsin digestion of the antibody
molecule; the Fab' fragments that can be generated by reducing the disulfide bridges of the
F(ab')2 fragment; the Fab fragments that can be generated by treating the antibody molecule
with papain and a reducing agent; and Fv fragments.

In the production of antibodies, screening for the desired antibody can be 
accomplished by techniques known in the art, e.g., ELISA (enzyme-linked immunosorbent 
assay). To select antibodies specific to a target protein, one may assay generated 
hybridomas or a phage display antibody library for an antibody that binds to the target 
protein.

5.3.5. METHODS OF MODIFYING PROTEIN ACTIVITIES 

Methods of directly modifying protein activities include, inter alia, dominant 
negative mutations, specific drugs (used in the sense of this application) or chemical 
moieties generally, and also the use of antibodies, as previously discussed.

Dominant negative mutations are mutations to endogenous genes or mutant
exogenous genes that when expressed in a cell disrupt the activity of a targeted protein
species. Depending on the structure and activity of the targeted protein, general rules exist
that guide the selection of an appropriate strategy for constructing dominant negative
mutations that disrupt activity of that target (Herskowitz, 1987, Nature 329:219-222). In
the case of active monomeric forms, overexpression of an inactive form can cause
competition for natural substrates or ligands sufficient to significantly reduce net activity of
the target protein. Such overexpression can be achieved by, for example, associating a
promoter, preferably a controllable or inducible promoter, of increased activity with the
mutant gene. Alternatively, changes to active site residues can be made so that a virtually
irreversible association occurs with the target ligand. Such can be achieved with certain
tyrosine kinases by careful replacement of active site serine residues (Perlmutter et al.,
1996, Current Opinion in Immunology 8:285-290).

In the case of active multimeric forms, several strategies can guide selection
of a dominant negative mutant. Multimeric activity can be controllably decreased by
expression of genes coding exogenous protein fragments that bind to multimeric association
domains and prevent multimer formation. Alternatively, controllable overexpression of an
inactive protein unit of a particular type can tie up wild-type active units in inactive
multimers, and thereby decrease multimeric activity (Nocka et al., 1990, EMBO J.
9:1805-1813). For example, in the case of dimeric DNA binding proteins, the DNA binding
domain can be deleted from the DNA binding unit, or the activation domain deleted from
the activation unit. Also, in this case, the DNA binding domain unit can be expressed
without the domain causing association with the activation unit. Thereby, DNA binding
sites are tied up without any possible activation of expression. In the case where a
particular type of unit normally undergoes a conformational change during activity,
expression of a rigid unit can inactivate resultant complexes. For a further example,
proteins involved in cellular mechanisms, such as cellular motility, the mitotic process,
cellular architecture, and so forth, are typically composed of associations of many subunits
of a few types. These structures are often highly sensitive to disruption by inclusion of a
few monomeric units with structural defects. Such mutant monomers disrupt the relevant
protein activities and can be controllably expressed in a cell.

In addition to dominant negative mutations, mutant target proteins that are 
sensitive to temperature (or other exogenous factors) can be found by mutagenesis and 
screening procedures that are well-known in the art. 

Also, one of skill in the art will appreciate that expression of antibodies 
binding and inhibiting a target protein can be employed as another dominant negative
strategy. 

5.3.6. DRUGS OF SPECIFIC KNOWN ACTION 

Additionally, activities of certain proteins can be controllably altered by
exposure to exogenous drugs or ligands. In a preferable case, a drug is known that interacts
with only one target protein in the cell and alters the activity of only that one target protein.
Graded exposure of a cell to varying amounts of that drug thereby causes graded
perturbations of pathways originating at that protein. The alteration can be either a decrease
or an increase of activity. Less preferably, a drug is known and used that alters the activity
of only a few (e.g., 2-5) target proteins with separate, distinguishable, and non-overlapping
effects. Graded exposure to such a drug causes graded perturbations to the several
pathways originating at the target proteins.

In a specific embodiment of the invention, when the pathway of interest is
the yeast ergosterol-pathway, a known drug which acts as an inhibitor of
ergosterol-biosynthesis may be used to perturb the pathway. Ergosterol is the primary membrane
sterol in fungi and in some trypanosomes. Ergosterol serves a structural role comparable to
that of cholesterol in mammalian cells, and is essential for the integrity and structure of the
fungal cell membrane. As depicted in Figure 12, the ergosterol synthesis pathway contains
at least 18 genes designated ERG1 through ERG26. Several different classes of antifungal
agents exist which target the ergosterol-pathway. Such drugs or agents may be used in
connection with the methods of the invention. In one embodiment, a known antifungal
drug is used to perturb the ergosterol-pathway. Such drugs include but are not limited to the
following.

The polyenes are a class of drugs that bind to ergosterol in the fungal
membrane, causing the cells to become leaky and die (Hamilton-Miller, J., 1973, Bacteriol.
Rev. 37:166). Polyenes and derivatives include drugs such as amphotericin B, nystatin,
and pimaricin.

Azoles are a second class of drugs which target the ergosterol-pathway.
Azoles act to inhibit C-14 demethylation of an ergosterol precursor called lanosterol.
Normally in the synthesis of ergosterol, the ERG11 gene product acts to demethylate
C-14 of lanosterol. Azoles inhibit this process, leading to a C-14 methylsterol product.
Consequently, incorporation of these altered products into the fungal membrane in place of
ergosterol leads to reduced membrane fluidity, reduced fungal growth, and reduced
invasiveness. Azoles include drugs such as clotrimazole, itraconazole, fluconazole,
miconazole, econazole, sulconazole, and ketoconazole.

A third class of ergosterol-pathway drugs is the allylamines-thiocarbamates,
which act to inhibit squalene epoxidase, the ERG1 gene product. Allylamine-thiocarbamate
derivatives include naftifine, tolnaftate, and terbinafine.

The morpholines are a fourth class of drugs that affect ergosterol synthesis.
Morpholines, such as amorolfine, act to block two separate steps of the ergosterol synthesis
pathway. Morpholines inhibit C-14 sterol reduction by the ERG24 gene product.
Morpholines also inhibit sterol Δ8→Δ7 isomerization by the ERG2 gene product.
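
The four drug classes described above and their stated points of action can be collected in a small lookup table. The following Python sketch is illustrative only: the names ERGOSTEROL_DRUG_CLASSES and classes_targeting are hypothetical and not part of the disclosure, and the entries merely restate the drug classes, example drugs, and ERG gene products named in the preceding paragraphs.

```python
# Illustrative lookup table restating the drug classes described in the text.
# The structure and names are assumptions made for this sketch.
ERGOSTEROL_DRUG_CLASSES = {
    "polyenes": {
        # Polyenes bind ergosterol itself in the membrane, not a gene product.
        "targets": [],
        "examples": ["amphotericin B", "nystatin", "pimaricin"],
    },
    "azoles": {
        "targets": ["ERG11"],  # C-14 demethylation of lanosterol
        "examples": ["clotrimazole", "itraconazole", "fluconazole",
                     "miconazole", "econazole", "sulconazole",
                     "ketoconazole"],
    },
    "allylamines-thiocarbamates": {
        "targets": ["ERG1"],  # squalene epoxidase
        "examples": ["naftifine", "tolnaftate", "terbinafine"],
    },
    "morpholines": {
        "targets": ["ERG24", "ERG2"],  # C-14 reduction; sterol isomerization
        "examples": ["amorolfine"],
    },
}


def classes_targeting(gene):
    """Return the sorted names of drug classes whose stated target is `gene`."""
    return sorted(
        name
        for name, info in ERGOSTEROL_DRUG_CLASSES.items()
        if gene in info["targets"]
    )
```

Such a table could be used, for instance, to choose a perturbing agent for a given step of the pathway before preparing treated and untreated cell populations.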

As will be appreciated by one skilled in the art, any known drug associated
with a particular biological pathway of interest may be used in connection with the methods
of the invention, for example, as an agent to perturb the particular biological pathway.

5.4. PREPARING THE MICROARRAY

The invention herein provides methods of using microarray technology to
identify reporter genes and target genes of a particular biological pathway. Microarrays may
be prepared by any method known in the art, including but not limited to the preparation
methods described herein below.

5.4.1. BINDING SITES ON THE MICROARRAYS 
As noted above, the "binding site" to which a particular polynucleotide 
molecule specifically hybridizes according to the invention is usually a complementary 
polynucleotide sequence. In one embodiment, the binding sites of the microarray are DNA
or DNA "mimics" (e.g., derivatives and analogues) corresponding to at least a portion of 
each gene in an organism's genome. In another embodiment, the binding sites of the 
microarray are complementary RNA or RNA mimics. 

DNA mimics are polymers composed of subunits capable of specific, 
Watson-Crick-like hybridization with DNA, or of specific hybridization with RNA. The
nucleic acids can be modified at the base moiety, at the sugar moiety, or at the phosphate 
backbone. Exemplary DNA mimics include, e.g., phosphorothioates. 

DNA can be obtained, e.g., by polymerase chain reaction ("PCR")
amplification of gene segments from genomic DNA, cDNA (e.g., by RT-PCR), or cloned
sequences. PCR primers are preferably chosen based on known sequences of the genes or
cDNA that result in amplification of unique fragments (e.g., fragments that do not share
more than 10 bases of contiguous identical sequence with any other fragment on the
microarray). Computer programs that are well known in the art are useful in the design of
primers with the required specificity and optimal amplification properties, such as Oligo
version 5.0 (National Biosciences). Typically, each binding site of the microarray will be
between about 20 bases and about 12,000 bases, and usually between about 300 bases and
about 2,000 bases in length, and still more usually between about 300 bases and about 800
bases in length. PCR methods are well known in the art, and are described, for example, in
Innis et al., eds., 1990, PCR Protocols: A Guide to Methods and Applications, Academic
Press Inc., San Diego, CA. It will be apparent to one skilled in the art that controlled
robotic systems are useful for isolating and amplifying nucleic acids. In a specific
embodiment of the invention, PCR methods are used to amplify ORFs of the S. cerevisiae yeast
genome. In a further preferred specific embodiment, amplification of the yeast genome is
performed such that each of the known or predicted ORFs in the yeast genome is prepared.
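
The uniqueness criterion stated above (fragments that do not share more than 10 bases of contiguous identical sequence with any other fragment on the microarray) can be checked computationally. The following Python sketch is one possible way to do so and is not the method of any particular primer-design program; the function names are hypothetical, and it compares sequences on the given strand only, which is a simplifying assumption.

```python
def shared_run_exceeds(seq_a, seq_b, max_shared=10):
    """Return True if the two sequences share an identical contiguous run
    longer than `max_shared` bases (i.e., at least max_shared + 1 bases)."""
    k = max_shared + 1
    if len(seq_a) < k or len(seq_b) < k:
        return False
    # Index every k-mer of the first sequence, then scan the second.
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    return any(seq_b[j:j + k] in kmers_a for j in range(len(seq_b) - k + 1))


def unique_fragments(fragments, max_shared=10):
    """Given a dict of name -> sequence, return the names of fragments that
    do not share a too-long identical run with any other fragment."""
    names = list(fragments)
    flagged = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if shared_run_exceeds(fragments[a].upper(),
                                  fragments[b].upper(), max_shared):
                flagged.update((a, b))
    return [n for n in names if n not in flagged]
```

A fuller check would presumably also compare each fragment against the reverse complements of the others; that step is omitted here for brevity.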

An alternative means for generating the polynucleotide binding sites of the
microarray is by synthesis of synthetic polynucleotides or oligonucleotides, e.g., using
H-phosphonate or phosphoramidite chemistries (Froehler et al., 1986, Nucleic Acid Res.
14:5399-5407; McBride et al., 1983, Tetrahedron Lett. 24:246-248). Synthetic sequences
are typically between about 15 and about 500 bases in length, more typically between about
20 and about 50 bases. In some embodiments, synthetic nucleic acids include non-natural
bases, such as, but by no means limited to, inosine. As noted above, nucleic acid analogues
may be used as binding sites for hybridization. An example of a suitable nucleic acid
analogue is peptide nucleic acid (see, e.g., Egholm et al., 1993, Nature 363:566-568; U.S.
Patent No. 5,539,083).

In alternative embodiments, the hybridization sites (i.e., the binding sites) are 
made from plasmid or phage clones of genes, cDNAs (e.g., expressed sequence tags), or
inserts therefrom (Nguyen et al, 1995, Genomics 29:207-209). 

5.4.2. ATTACHING BINDING SITES TO THE SOLID SURFACE
Solid supports on which binding sites of microarrays may be immobilized
are well-known in the art and include filter materials, such as nitrocellulose, cellulose
acetate, nylon, and polyester, among others, as well as non-porous materials, such as glass,
plastic (e.g., polypropylene), polyacrylamide, and silicon. In general, non-porous supports,
and glass in particular, are preferred. The solid support may also be treated in such a way as
to enhance binding of oligonucleotides thereto, or to reduce non-specific binding of
unwanted substances thereto. For example, it is often desirable to treat a glass support with
polylysine or silane to facilitate attachment of binding sites such as oligonucleotides to the
glass. A preferred method for attaching binding sites such as nucleic acids to a surface is by
printing on glass plates, as is described generally by Schena et al., 1995, Science
270:467-470. This method is especially useful for preparing microarrays of cDNA (See also, DeRisi
et al., 1996, Nature Genetics 14:457-460; Shalon et al., 1996, Genome Res. 6:639-645;
and Schena et al., 1995, Proc. Natl. Acad. Sci. U.S.A. 93:10539-11286). Blanchard
discloses the use of an ink jet printer for oligonucleotide synthesis (U.S. Application Serial
No. 09/008,120, filed Jan. 16, 1998).

Methods of immobilizing binding sites on the solid support may include
direct touch, micropipetting (Yershov, K. et al., Genetics 93:4913, 1996), or the use of
controlled electric fields to direct a given oligonucleotide to a specific spot in the array
(U.S. Patent 5,605,662 issued to Heller et al.). In a specific embodiment, DNA is typically
immobilized at a density of 100 to 10,000 oligonucleotides per cm2 and preferably at a
density of about 1000 oligonucleotides per cm2.

In a preferred embodiment, binding sites (e.g., oligonucleotides) are
synthesized directly on said support (Maskos, U. et al., 1993, Nucl. Acids Res. 21:2267;
Fodor, S. P. et al., 1991, Science 251:767; Blanchard et al., 1996, Biosens. Bioelectron.
11:687). Among methods of synthesizing oligonucleotides directly on a solid support,
particularly preferred methods are photolithography (see, e.g., Fodor, supra, and McGall et
al., 1996, Proc. Natl. Acad. Sci. (USA) 93:13555) and, most preferred, piezoelectric
printing (see, e.g., Blanchard, supra).

A second preferred method for making microarrays is by making
high-density oligonucleotide arrays. Techniques are known for producing arrays containing
thousands of oligonucleotides complementary to defined sequences, at defined locations on
a surface using photolithographic techniques for synthesis in situ (see, Fodor et al., 1991,
Science 251:767-773; Pease et al., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:5022-5026;
Lockhart et al., 1996, Nature Biotechnology 14:1675; U.S. Patent Nos. 5,578,832;
5,556,752; and 5,510,270) or other methods for rapid synthesis and deposition of defined
oligonucleotides (Blanchard et al., Biosensors & Bioelectronics 11:687-690). When these
methods are used, oligonucleotides (e.g., 20-mers) of known sequence are synthesized
directly on a surface such as a derivatized glass slide. Usually, the array produced is
redundant, with several oligonucleotide molecules per RNA. Oligonucleotide binding sites
can be chosen to detect alternatively spliced mRNAs.

Other methods for making microarrays, e.g., by masking (Maskos and
Southern, 1992, Nuc. Acids. Res. 20:1679-1684), may also be used. In principle, any type
of array, for example, dot blots on a nylon hybridization membrane (see Sambrook et al.,
supra) could be used. However, as will be recognized by those skilled in the art, very small
arrays will frequently be preferred because hybridization volumes will be smaller.

5.4.3. TARGET POLYNUCLEOTIDE MOLECULES
As described, supra, the polynucleotide molecules which may be analyzed
by the present invention may be from any source, including naturally occurring nucleic acid
molecules, as well as synthetic nucleic acid molecules. In a preferred embodiment, the
polynucleotide molecules analyzed by the invention comprise RNA, including, but by no
means limited to, total cellular RNA, poly(A)+ messenger RNA (mRNA), fractions thereof,
or RNA transcribed from cDNA. In a specific embodiment, cellular RNAs or DNAs from
two cell populations (e.g., RNA of S. cerevisiae untreated or treated with a specific drug)
are analyzed by incubating both populations of RNAs with the microarray. In a specific
embodiment of the invention, S. cerevisiae is concentrated or treated with a drug or agent
known to alter the ergosterol pathway (e.g., clotrimazole). In yet another specific
embodiment, S. cerevisiae containing a deletion mutation is used to identify gene function.
Methods for preparing total and poly(A)+ RNA are well known in the art, and are described
generally, e.g., in Sambrook et al., supra. In one embodiment, RNA is extracted from cells
of the various types of interest in this invention using guanidinium thiocyanate lysis
followed by CsCl centrifugation (Chirgwin et al., 1979, Biochemistry 18:5294-5299).
Poly(A)+ RNA is selected with oligo-dT cellulose. Cells of interest include, but are
by no means limited to, wild-type cells, drug-exposed wild-type cells, modified cells,
diseased cells, and, in particular, cancer cells.

In one embodiment, RNA can be fragmented by methods known in the art,
e.g., by incubation with ZnCl2, to generate fragments of RNA. In one embodiment, isolated
mRNA can be converted to antisense RNA synthesized by in vitro transcription of
double-stranded cDNA in the presence of labeled dNTPs (Lockhart et al., 1996, Nature
Biotechnology 14:1675).

In other embodiments, the polynucleotide molecules to be analyzed may be
DNA molecules such as fragmented genomic DNA, or PCR products of amplified mRNA
or cDNA. In a preferred embodiment of the invention the polynucleotide molecules to be
analyzed are cDNAs which are reverse transcribed from mRNAs. In a specific embodiment
of the invention the polynucleotide molecules analyzed are cDNAs reverse transcribed from
mRNAs of fungal cells treated with antifungal drugs.

5.4.4. HYBRIDIZATION OF POLYNUCLEOTIDES
TO MICROARRAYS

Nucleic acid hybridization and wash conditions are chosen so that the
polynucleotide molecules to be analyzed by the invention "specifically bind" or
"specifically hybridize" to the complementary polynucleotide sequences of the array,
preferably to a specific array site, wherein its complementary DNA is located.

Arrays containing double-stranded binding site DNA situated thereon are
preferably subjected to denaturing conditions to render the DNA single-stranded prior to
contacting with the target polynucleotide molecules. Arrays containing single-stranded
binding site DNA (e.g., synthetic oligodeoxyribonucleic acids) may need to be denatured
prior to contacting with the target polynucleotide molecules, e.g., to remove hairpins or
dimers which form due to self complementary sequences.
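
Whether a single-stranded binding site is likely to form a hairpin through self complementary sequence can be estimated from its sequence alone. The following Python sketch is a deliberately simple illustration, not a thermodynamic prediction; the function names and the default stem and loop lengths are assumptions chosen for the example, and intermolecular dimers are not considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_self_complementarity(oligo, stem=4, min_loop=3):
    """Return True if some `stem`-base segment of the oligo can pair with a
    downstream segment separated by at least `min_loop` bases, i.e. the
    oligo contains a potential hairpin stem."""
    s = oligo.upper()
    for i in range(len(s) - stem + 1):
        # The pairing partner of s[i:i+stem] reads as its reverse complement.
        partner = reverse_complement(s[i:i + stem])
        search_from = i + stem + min_loop
        if s.find(partner, search_from) != -1:
            return True
    return False
```

Oligonucleotides flagged in this way would be candidates for the denaturation step described above before contact with the target polynucleotide molecules.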

Optimal hybridization conditions will depend on the length (e.g., oligomer
versus polynucleotide greater than 200 bases) and type (e.g., RNA or DNA) of binding site
and target nucleic acids. General parameters for specific (i.e., stringent) hybridization
conditions are described in Sambrook et al. (supra), and in Ausubel et al., 1987, Current
Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York.
When the cDNA microarrays of Schena et al. (Schena et al., 1996, Proc. Natl. Acad. Sci.
U.S.A. 93:10614) are used, typical hybridization conditions are hybridization in 5x SSC
plus 0.2% SDS at 65 °C for four hours, followed by washes at 25 °C in high stringency
wash buffer (0.1x SSC plus 0.2% SDS) (Schena et al., 1996, Proc. Natl. Acad. Sci. U.S.A.
93:10614). Useful hybridization conditions are also provided in, e.g., Tijssen, 1993,
Hybridization With Nucleic Acid Probes, Elsevier Science Publishers B.V.; and Kricka,
1992, Nonisotopic DNA Probe Techniques, Academic Press, San Diego, CA.

In another specific embodiment, use of a nucleic acid which is hybridizable
to an S. cerevisiae nucleic acid or to its reverse complement, or to a nucleic acid encoding
an ergosterol derivative, or to its reverse complement, under conditions of low stringency is
provided. By way of example and not limitation, procedures using such conditions of low
stringency are as follows (see also Shilo and Weinberg, 1981, Proc. Natl. Acad. Sci. U.S.A.
78:6789-6792). Arrays containing DNA are pretreated for 6 h at 40 °C in a solution
containing 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP,
0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA. Hybridizations are
carried out in the same solution with the following modifications: 0.02% PVP, 0.02%
Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, 10% (wt/vol) dextran sulfate, and
5-20 x 10^6 cpm 32P-labeled probe is used. Arrays are incubated in hybridization mixture for
18-20 h at 40 °C, and then washed for 1.5 h at 55 °C in a solution containing 2X SSC, 25
mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced with
fresh solution and incubated an additional 1.5 h at 60 °C. Arrays are blotted dry and
visualized. If necessary, arrays are washed for a third time at 65-68 °C and re-visualized.
Other conditions of low stringency which may be used are well known in the art (e.g., as
employed for cross-species hybridizations).

In another specific embodiment, use of a nucleic acid which is hybridizable
to an ergosterol nucleic acid, or its reverse complement, under conditions of high stringency
is provided. By way of example and not limitation, procedures using such conditions of
high stringency are as follows. Prehybridization of arrays containing DNA is carried out for
8 h to overnight at 65 °C in buffer composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM
EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 µg/ml denatured salmon sperm
DNA. Arrays are hybridized for 48 h at 65 °C in prehybridization mixture containing 100
µg/ml denatured salmon sperm DNA and 5-20 x 10^6 cpm of 32P-labeled probe. Washing of
arrays is done at 37 °C for 1 h in a solution containing 2X SSC, 0.01% PVP, 0.01% Ficoll,
and 0.01% BSA. This is followed by a wash in 0.1X SSC at 50 °C for 45 min before
autoradiography. Other conditions of high stringency which may be used are well known in
the art.

In another specific embodiment, use of a nucleic acid which is hybridizable
to an ergosterol nucleic acid, or its reverse complement, under conditions of moderate
stringency is provided. Selection of appropriate conditions for such stringencies is well
known in the art (see e.g., Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual,
2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York; see also,
Ausubel et al., eds., in the Current Protocols in Molecular Biology series of laboratory
technique manuals, © 1987-1997, Current Protocols, © 1994-1997 John Wiley and Sons,
Inc.).

In another embodiment, after hybridization, stringency conditions are as
follows. Each array is washed two times for 30 minutes each at 45 °C in 40 mM
sodium phosphate, pH 7.2, 5% SDS, 1 mM EDTA, 0.5% bovine serum albumin, followed
by four washes each for 30 minutes in sodium phosphate, pH 7.2, 1% SDS, 1 mM EDTA,
and subsequently each array is treated differently as described below for low, medium, or
high stringency hybridization conditions. For low stringency hybridization, arrays are not
washed further. For medium stringency hybridization, membranes are additionally
subjected to four washes each for 30 minutes in 40 mM sodium phosphate, pH 7.2, 1%
SDS, 1 mM EDTA at 55 °C. For high stringency hybridization, following the washes for
low stringency, membranes are additionally subjected to four washes each for 30 minutes in
40 mM sodium phosphate, pH 7.2, 1% SDS, 1 mM EDTA at 55 °C, followed by four
washes each for 30 minutes in sodium phosphate, pH 7.2, 1% SDS, 1 mM EDTA at 65 °C.
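
The post-hybridization wash scheme just described is cumulative: medium and high stringency each add wash steps to those used for low stringency. The following Python sketch restates that scheme as data so the steps for each stringency can be listed and totalled. The names Wash, wash_schedule, and total_wash_minutes are hypothetical; where the text does not state a temperature or buffer concentration, the sketch leaves it unspecified rather than assuming a value.

```python
from collections import namedtuple

# One wash step: number of repeats, minutes per repeat, temperature in
# degrees C (None where the text does not state it), and buffer as written.
Wash = namedtuple("Wash", "repeats minutes temp_c buffer")

COMMON_WASHES = [
    Wash(2, 30, 45, "40 mM sodium phosphate pH 7.2, 5% SDS, 1 mM EDTA, 0.5% BSA"),
    Wash(4, 30, None, "sodium phosphate pH 7.2, 1% SDS, 1 mM EDTA"),
]
WASH_55 = Wash(4, 30, 55, "40 mM sodium phosphate pH 7.2, 1% SDS, 1 mM EDTA")
WASH_65 = Wash(4, 30, 65, "sodium phosphate pH 7.2, 1% SDS, 1 mM EDTA")

EXTRA_WASHES = {
    "low": [],
    "medium": [WASH_55],
    "high": [WASH_55, WASH_65],
}


def wash_schedule(stringency):
    """Return the ordered list of wash steps for 'low', 'medium', or 'high'."""
    if stringency not in EXTRA_WASHES:
        raise ValueError("stringency must be 'low', 'medium', or 'high'")
    return COMMON_WASHES + EXTRA_WASHES[stringency]


def total_wash_minutes(stringency):
    """Return the total washing time in minutes for the given stringency."""
    return sum(w.repeats * w.minutes for w in wash_schedule(stringency))
```

Under this reading, low stringency comprises six 30-minute washes, medium stringency ten, and high stringency fourteen.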

Use of nucleic acids encoding derivatives and analogs of ergosterol-pathway 
proteins, and ergosterol antisense nucleic acids for antifungal therapies or drug targets are
additionally provided. 

Use of fragments of ergosterol nucleic acids comprising regions conserved 
between (i.e., with homology to) other ergosterol nucleic acids, of the same or different
species, are also provided. 

5.4.5. SIGNAL DETECTION ON HYBRIDIZED
MICROARRAYS AND DATA ANALYSIS 

It will be appreciated that when cDNA complementary to the mRNA of a
cell is made and hybridized to a microarray under suitable hybridization conditions, the
level of hybridization to the site in the array corresponding to any particular gene will
reflect the prevalence in the cell of mRNA transcribed from that gene. For example, when
detectably labeled (e.g., with a fluorophore) cDNA complementary to the total cellular
mRNA is hybridized to a microarray, the site on the array corresponding to a gene (i.e.,
capable of specifically binding the product of the gene) that is not transcribed in the cell will
have little or no signal (e.g., fluorescent signal), and a gene for which the encoded mRNA is
prevalent will have a relatively strong signal.

In preferred embodiments, cDNAs from two different cells (e.g., untreated
and drug treated) are hybridized to the binding sites of the microarray. In the case of drug
responses, one cell is exposed to a drug and another cell of the same type is not exposed to
the drug. The cDNAs derived from each of the two cell types are differently labeled so that
they can be distinguished. In one embodiment, for example, cDNA from a cell treated with
a drug is synthesized using a fluorescein-labeled dNTP, and cDNA from a second cell, not
drug-exposed, is synthesized using a rhodamine-labeled dNTP. When the two cDNAs are
mixed and hybridized to the microarray, the relative intensity of signal from each cDNA set
is determined for each site on the array, and any relative difference in abundance of a
particular mRNA is thereby detected.

In the example described above, the cDNA from the drug-treated cell will
fluoresce green when the fluorophore is stimulated, and the cDNA from the untreated cell
will fluoresce red. As a result, when the drug treatment has no effect, either directly or
indirectly, on the relative abundance of a particular mRNA in a cell, the mRNA will be
equally prevalent in both cells, and, upon reverse transcription, red-labeled and
green-labeled cDNA will be equally prevalent. When hybridized to the microarray, the binding
site(s) for that species of RNA will emit wavelengths characteristic of both fluorophores. In
contrast, when the drug-exposed cell is treated with a drug that, directly or indirectly,
increases the prevalence of the mRNA in the cell, the ratio of green to red fluorescence will
increase. When the drug decreases the mRNA prevalence, the ratio will decrease.
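
By way of illustration only, the relationship between the green-to-red ratio and the change in mRNA prevalence may be sketched as follows. The function name `green_red_ratio` and the intensity values are hypothetical and are not part of the disclosure; the base-2 logarithm is shown merely as one convenient way to make increases and decreases symmetric about zero.

```python
import math

def green_red_ratio(green, red):
    """Return the green/red intensity ratio for one array site.

    green: signal from the fluorescein-labeled (drug-treated) cDNA
    red:   signal from the rhodamine-labeled (untreated) cDNA
    """
    if red <= 0:
        raise ValueError("red-channel intensity must be positive")
    return green / red

# Hypothetical background-subtracted intensities for three sites.
sites = {
    "unchanged": (1000.0, 1000.0),  # drug has no effect on this mRNA
    "induced": (4000.0, 1000.0),    # drug increases mRNA prevalence
    "repressed": (250.0, 1000.0),   # drug decreases mRNA prevalence
}

for name, (g, r) in sites.items():
    ratio = green_red_ratio(g, r)
    # A ratio above 1 (log2 > 0) indicates induction; below 1, repression.
    print(name, ratio, math.log2(ratio))
```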

The use of a two-color fluorescence labeling and detection scheme to define
alterations in gene expression has been described (see, e.g., Schena et al., 1995, Science
270:467-470). An advantage of using cDNA labeled with two different fluorophores is that
a direct and internally controlled comparison of the mRNA levels corresponding to each
arrayed gene in two cell states can be made, and variations due to minor differences in
experimental conditions (e.g., hybridization conditions) will not affect subsequent analyses.
However, it will be recognized that it is also possible to use cDNA from a single cell, and
compare, for example, the absolute amount of a particular mRNA in, e.g., a drug-treated or
pathway-perturbed cell and an untreated cell.

When fluorescently labeled probes are used, the fluorescence emissions at
each site of a transcript array are preferably detected by scanning confocal laser
microscopy (see, e.g., Fodor, S., et al., 1993, Nature 364:555). In one embodiment, a
separate scan, using the appropriate excitation line, is carried out for each of the two
fluorophores used. Among fluorescent dyes that may be used to label DNA and RNA are
fluorescein, lissamine, Cy3, Cy5, phycoerythrin, and rhodamine 110. Cy3 and Cy5 are
particularly preferred. In a specific embodiment, where the sample to be hybridized is
cDNA, labeling is accomplished by incorporating fluorescently-labeled deoxynucleotide
triphosphates (dNTPs), such as Cy3- or Cy5-dUTP, during in vitro reverse transcription.
Fluorescently-labeled dNTPs are commercially available from sources such as Amersham
Pharmacia Biotech, Piscataway, NJ. Alternatively, cDNAs are labeled indirectly by
incorporating biotinylated nucleotides during cDNA synthesis, followed by the addition of
fluorescently-labeled avidin or streptavidin. Biotinylated dNTPs are available from Enzo
(Farmingdale, NY) and Boehringer Mannheim (Indianapolis, IN), while
fluorescently-labeled avidin and streptavidin are available from Becton Dickinson (Mountain View, CA)
and Molecular Probes (Eugene, OR). Methods of reverse transcription and labeling are
well-known in the art and are described, for example, in Ausubel, F. et al., eds., 1994, Current
Protocols in Molecular Biology, New York; DeRisi, J., 1997, Science 278:680-86; and
Schena, M., et al., 1996, Proc. Natl. Acad. Sci. USA 93:10614-19.

Alternatively, a laser can be used that allows simultaneous specimen
illumination at wavelengths specific to the two fluorophores, and emissions from the two
fluorophores can be analyzed simultaneously (see Shalon et al., 1996, Genome Res.
6:639-645). In a preferred embodiment, the arrays are scanned with a laser fluorescent scanner
with a computer controlled X-Y stage and a microscope objective. Although simultaneous
hybridization of differentially labeled cDNA samples is preferred, a single label may also be
used to perform hybridizations sequentially rather than simultaneously.
Sequential excitation of the two fluorophores is achieved with a multi-line, mixed gas laser,
and the emitted light is split by wavelength and detected with two photomultiplier tubes.
Such fluorescence laser scanning devices are described, e.g., in Schena et al., 1996, Genome
Res. 6:639-645. Alternatively, the fiber-optic bundle described by Ferguson et al., 1996,
Nature Biotech. 14:1681-1684, may be used to monitor mRNA abundance levels at a large
number of sites simultaneously.

In one embodiment, where the sample to be hybridized is mRNA, labeling is
accomplished by incorporating fluorescently-labeled ribonucleotides or biotinylated
ribonucleotides during in vitro transcription, as described in Lockhart, D. J. et al., 1996,
Nature Biotech. 14:1675-80.

Although it is preferred to use fluorescent labels, other labels may also be
employed, such as radioisotopes, enzymes, and luminescers. Such methods are well-known
to those of skill in the art.

To probe a DNA microarray, the labeled samples are hybridized to the
microarray under a fixed set of conditions, such as sample concentration, temperature,
buffer and salt concentration, incubation time, etc. (see, e.g., Section 5.4.4, herein). After
washing to remove unbound sample, the microarray is excited with specific wavelengths of
light and scanned to detect fluorescence. Typically, two samples, each labeled with a
different fluor, are hybridized simultaneously to permit differential expression
measurements. When neither sample hybridizes to a given spot in the array, no fluorescence
is detected. When only one sample hybridizes to a given spot, the color of the resulting
fluorescence will correspond to that of the fluor used to label the hybridizing sample (e.g.,
green when the sample was labeled with fluorescein, or red, if the sample was labeled with
rhodamine). When both samples hybridize to the same spot, a combinatorial color is
produced (e.g., yellow if the samples were labeled with fluorescein and rhodamine). Then,
applying methods of pattern recognition and data analysis as described herein and in U.S.
patent application serial no. 09/179,569, filed October 27, 1998, now pending, in U.S.
patent application serial no. 09/220,275, filed December 23, 1998, now pending, and in U.S.
patent application serial no. 09/220,142, filed December 23, 1998, now pending, each of
which is incorporated herein by reference in its entirety, it is possible to quantify
differences in gene expression between the samples.

Signals are recorded and, in a preferred embodiment, analyzed by computer,
e.g., using a 12-bit analog-to-digital board. In one embodiment, the scanned image is
despeckled using a graphics program (e.g., Hijaak Graphics Suite) and then analyzed using
an image gridding program that creates a spreadsheet of the average hybridization at each
wavelength at each site. If necessary, an experimentally determined correction for "cross
talk" (or overlap) between the channels for the two fluorophores may be made. For any
particular hybridization site on the transcript array, a ratio of the emission of the two
fluorophores can be calculated. The ratio is independent of the absolute expression level of
the cognate gene, but is useful for genes whose expression is significantly modulated by
drug administration, gene deletion, or any other tested event.
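
As a non-limiting sketch, a correction for cross talk followed by calculation of the emission ratio might be carried out as shown below. The linear overlap model, the function name `correct_crosstalk`, and the overlap coefficients are assumptions made for illustration only; the disclosure states only that the correction is experimentally determined.

```python
def correct_crosstalk(meas_green, meas_red, red_into_green, green_into_red):
    """Undo linear channel overlap (an assumed model, not from the text).

    Assumed model:
        meas_green = true_green + red_into_green * true_red
        meas_red   = true_red   + green_into_red * true_green
    The two overlap coefficients are taken to be experimentally determined.
    """
    det = 1.0 - red_into_green * green_into_red
    if det <= 0:
        raise ValueError("overlap coefficients are too large to invert")
    true_green = (meas_green - red_into_green * meas_red) / det
    true_red = (meas_red - green_into_red * meas_green) / det
    return true_green, true_red

# Hypothetical site: true signals 1000 (green) and 500 (red),
# with 10% of red leaking into green and 5% of green leaking into red.
measured_green = 1000.0 + 0.10 * 500.0
measured_red = 500.0 + 0.05 * 1000.0

green, red = correct_crosstalk(measured_green, measured_red, 0.10, 0.05)
ratio = green / red  # emission ratio for this hybridization site
print(green, red, ratio)
```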

According to the method of the invention, the relative abundance of an
mRNA in two cells or cell lines is scored as a perturbation and its magnitude determined
(i.e., the abundance is different in the two sources of mRNA tested) or as not perturbed (i.e.,
the relative abundance is the same; see U.S. patent application serial no. 09/179,569, filed October 27,
1998, U.S. patent application serial no. 09/220,142, filed December 23, 1998, now pending, and U.S. patent
application serial no. 09/220,275, filed December 23, 1998, which are incorporated herein by reference
in their entirety). As used herein, a difference between the two sources of RNA of at least a
factor of about 25% (i.e., RNA is 25% more abundant in one source than in the other
source), more usually about 50%, even more often by a factor of about 2 (i.e., twice as
abundant), 3 (three times as abundant), or 5 (five times as abundant) is scored as a
perturbation. Present detection methods allow reliable detection of differences on the order of
about 3-fold to about 5-fold, but more sensitive methods are expected to be developed.

Preferably, in addition to identifying a perturbation as positive or negative, it
is advantageous to determine the magnitude of the perturbation. This can be carried out, as
noted above, by calculating the ratio of the emission of the two fluorophores used for
differential labeling, or by analogous methods that will be readily apparent to those of skill
in the art.
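
The scoring of a perturbation and the determination of its magnitude can be illustrated by the following sketch. The function name `score_perturbation` and its default threshold are hypothetical; the threshold of 1.25 corresponds to the "about 25%" difference mentioned above, and other thresholds (e.g., 1.5, 2, 3, or 5) may be substituted.

```python
def score_perturbation(abundance_a, abundance_b, min_fold=1.25):
    """Score the relative abundance of one mRNA in two sources.

    Returns (perturbed, fold), where fold is the larger abundance divided
    by the smaller, and perturbed is True when fold reaches min_fold.
    """
    if abundance_a <= 0 or abundance_b <= 0:
        raise ValueError("abundances must be positive")
    fold = max(abundance_a, abundance_b) / min(abundance_a, abundance_b)
    return fold >= min_fold, fold

# Hypothetical abundances (arbitrary units) in two sources of mRNA.
print(score_perturbation(110.0, 100.0))                # below the 25% threshold
print(score_perturbation(200.0, 100.0, min_fold=2.0))  # twice as abundant
print(score_perturbation(100.0, 300.0, min_fold=3.0))  # three times as abundant
```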

5.4.6. OTHER METHODS OF TRANSCRIPTIONAL 
STATE MEASUREMENT 

The transcriptional state of a cell may be measured by other gene expression
technologies known in the art. Several such technologies produce pools of restriction
fragments of limited complexity for electrophoretic analysis, such as methods combining
double restriction enzyme digestion with phasing primers (see, e.g., European Patent
0 534 858 A1, filed September 24, 1992, by Zabeau et al.), or methods selecting restriction
fragments with sites closest to a defined mRNA end (see, e.g., Prashar et al., 1996, Proc.
Natl. Acad. Sci. U.S.A. 93:659-663). Other methods statistically sample cDNA pools, such
as by sequencing sufficient bases (e.g., 20-50 bases) in each of multiple cDNAs to identify
each cDNA, or by sequencing short tags (e.g., 9-10 bases) which are generated at known
positions relative to a defined mRNA end (see, e.g., Velculescu, 1995, Science
270:484-487).
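
The statistical sampling of cDNA pools by short tags may be sketched as follows. This is an illustrative assumption rather than a description of any cited method: the anchor sequence `CATG`, the tag length of 10 bases, and the function names `extract_tag` and `tag_counts` are hypothetical choices, and the sequences are invented for the example.

```python
from collections import Counter

def extract_tag(cdna, anchor="CATG", tag_len=10):
    """Return the tag_len bases following the last anchor site, or None.

    The last anchor site is used as the "known position relative to a
    defined mRNA end" (an assumption made for this sketch).
    """
    pos = cdna.rfind(anchor)
    if pos == -1:
        return None  # no anchor site in this cDNA
    start = pos + len(anchor)
    tag = cdna[start:start + tag_len]
    return tag if len(tag) == tag_len else None

def tag_counts(cdnas, anchor="CATG", tag_len=10):
    """Count how often each tag occurs in a sampled pool of cDNAs."""
    tags = (extract_tag(c, anchor, tag_len) for c in cdnas)
    return Counter(t for t in tags if t is not None)

# Hypothetical sampled cDNA sequences; tag frequency estimates abundance.
pool = [
    "AAACATGACGTACGTACTTT",
    "AAACATGACGTACGTACTTT",
    "GGCATGTTTTTTTTTTAA",
    "AAAAAA",
]
print(tag_counts(pool))
```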

Such methods and systems of measuring transcriptional state, although less 
preferable than microarrays, may, nevertheless, be used in the present invention. 

5.4.7. MEASUREMENT OF OTHER ASPECTS 
OF BIOLOGICAL STATE 

In various embodiments of the present invention, aspects of the biological
state other than the transcriptional state, such as the translational state, the activity state, or
mixed aspects can be measured in order to obtain drug and pathway responses. Details of
these embodiments are described in this section.

5.4.7.1. EMBODIMENTS BASED ON TRANSLATIONAL
STATE MEASUREMENTS

Measurement of the translational state may be performed according to
several methods. For example, whole genome monitoring of protein (i.e., the "proteome,"
Goffeau et al., supra) can be carried out by constructing a microarray in which binding sites
comprise immobilized, preferably monoclonal, antibodies specific to a plurality of protein
species encoded by the cell genome. Preferably, antibodies are present for a substantial
fraction of the encoded proteins, or at least for those proteins relevant to the action of a drug
of interest. Methods for making monoclonal antibodies are well known (see, e.g., Harlow
and Lane, 1988, Antibodies: A Laboratory Manual, Cold Spring Harbor, New York, which
is incorporated in its entirety for all purposes). In a preferred embodiment, monoclonal
antibodies are raised against synthetic peptide fragments designed based on genomic
sequence of the cell. With such an antibody array, proteins from the cell are contacted to
the array and their binding is assayed with assays known in the art.

Alternatively, proteins can be separated by two-dimensional gel
electrophoresis systems. Two-dimensional gel electrophoresis is well-known in the art and
typically involves iso-electric focusing along a first dimension followed by SDS-PAGE
electrophoresis along a second dimension. See, e.g., Hames et al., 1990, Gel
Electrophoresis of Proteins: A Practical Approach, IRL Press, New York; Shevchenko et al.,
1996, Proc. Nat'l Acad. Sci. USA 93:1440-1445; Sagliocco et al., 1996, Yeast
12:1519-1533; Lander, 1996, Science 274:536-539. The resulting electropherograms can be
analyzed by numerous techniques, including mass spectrometric techniques, western
blotting and immunoblot analysis using polyclonal and monoclonal antibodies, and internal
and N-terminal micro-sequencing. Using these techniques, it is possible to identify a
substantial fraction of all the proteins produced under given physiological conditions,
including in cells (e.g., in yeast) exposed to a drug, or in cells modified by, e.g., deletion or
over-expression of a specific gene.

5.4.7.2. EMBODIMENTS BASED ON OTHER ASPECTS 
OF THE BIOLOGICAL STATE 

Even though methods of this invention are illustrated by embodiments 
involving gene expression profiles, the methods of the invention are applicable to any 
cellular constituent that can be monitored. 

In particular, where activities of proteins relevant to the characterization of a
perturbation, such as drug action, can be measured, embodiments of this invention can be
based on such measurements. Activity measurements can be performed by any functional,
biochemical, or physical means appropriate to the particular activity being characterized.
Where the activity involves a chemical transformation, the cellular protein can be contacted
with the natural substrate(s), and the rate of transformation measured. Where the activity
involves association in multimeric units, for example association of an activated DNA
binding complex with DNA, the amount of associated protein or secondary consequences of
the association, such as amounts of mRNA transcribed, can be measured. Also, where only
a functional activity is known, for example, as in cell cycle control, performance of the
function can be observed. However known and measured, the changes in protein activities
form the response data analyzed by the foregoing methods of this invention.

In alternative and non-limiting embodiments, response data may be formed of
mixed aspects of the biological state of a cell. Response data can be constructed from, e.g.,
changes in certain mRNA abundances, changes in certain protein abundances, and changes
in certain protein activities.

5.5. DRUG DEVELOPMENT WITH TARGET GENES 

The invention provides methods for the identification of target genes which
may be used for the development of drugs and therapeutic agents that target a pathway of
interest. By way of example, the invention is illustrated in terms of an ergosterol-pathway
target gene; however, one skilled in the art will appreciate that the methods described herein
may be applied to any pathway of interest and used for the development of drugs and/or
therapeutic agents which target the pathway of interest. For example, one pathway of
interest is the ergosterol-pathway of yeast. As described above, a target gene for a pathway
such as the ergosterol-pathway may be identified by the methods of the invention (e.g., by
using cluster analysis followed by validation of the gene as a target). Target genes of the
ergosterol-pathway may be used in controlling fungal infection of human, animal, or plant
species. For example, the proteins encoded by a novel target gene of the
ergosterol-pathway provide targets for antifungal and fungicidal agents. For example, a drug may be
developed to inhibit an essential ergosterol-pathway target gene or the protein encoded by
such a gene. Inhibition of an essential target gene or protein modifies the growth,
reproduction, and/or survival of a fungus containing the essential target gene, and the drug may thus be
used as an antifungal or fungicidal agent. In yet another embodiment, the drug or therapeutic
agent is a dominant negative form of an ergosterol-pathway protein, which inactivates the
protein encoded by the target gene of the ergosterol-pathway and may be used as an
antifungal or fungicidal agent. In yet another embodiment, antisense ergosterol-pathway
nucleic acids may be used to inactivate an essential target gene, and therefore provide an
antifungal or fungicidal agent. Further, as will be appreciated by one skilled in the art, when
a target gene is discovered by the methods of the invention, such a target may be found in
species other than that in which the target gene was first discovered, and may provide useful
drug targets in such species. For example, if a target gene of the ergosterol-pathway is
discovered in S. cerevisiae, this gene is not only a target for antifungal or fungicidal drug
development against S. cerevisiae, but may lead to the development of antifungal or
fungicidal agents for other fungal species as well.

Fungi which may be used or tested in connection with the methods of the
invention include but are not limited to: Cryptococcus species, including Cryptococcus
neoformans; Blastomyces species, including Blastomyces dermatitidis; Ajellomyces species,
including Ajellomyces dermatitidis; Histoplasma species, including Histoplasma
capsulatum; Coccidioides species, including Coccidioides immitis; Candida species,
including C. albicans, C. tropicalis, C. parapsilosis, C. guilliermondii, and C. krusei;
Aspergillus species, including A. fumigatus, A. flavus, and A. niger; Rhizopus species;
Rhizomucor species; Cunninghamella species; Apophysomyces species, including A.
saksenaea, A. mucor, A. absidia; Sporothrix species, including Sporothrix schenckii;
Paracoccidioides species, including Paracoccidioides brasiliensis; Pseudallescheria
species, including Pseudallescheria boydii; Torulopsis species, including Torulopsis
glabrata; Dermatophytes species; Histoplasma species; Pneumocystis species; Blastomyces
species; Penicillium species; Microsporum species; Epidermophyton species; Trichophyton
species; Saccharomyces species, including S. cerevisiae; Schizosaccharomyces species, including S.
pombe; Trichosporon species; Rhodotorula species; and Malassezia species.

Tests for antifungal activities can be any method known in the art. Such
methods may include contacting one or more test fungal cells with the potential antifungal
drug and measuring the growth inhibition or death of the fungal cells. A drug which exhibits
a high rate of killing of the test fungus at low dose is a preferred antifungal drug. In one
embodiment, the antifungal drug kills 50-75% of the test fungal cells. In another
embodiment, the antifungal drug kills 75-85% of the test fungal cells. In a preferred
embodiment, the antifungal drug kills 85-95% of the test fungal cells. In a more preferred
embodiment, the antifungal drug kills 95-99% of the test fungal cells. In a most preferred
embodiment, the antifungal drug kills 100% of the test fungal cells. In other embodiments
of the invention, the dose of the drug is in the range of 1-10 nM, 10-100 nM, 100-1000 nM,
1-10 µM, or 10-100 µM.
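
Purely as an illustration of how a killing assay result might be compared against the ranges recited above, consider the following sketch. The function names, the cell counts, and the handling of values that fall on or between the recited range boundaries (e.g., assigning a value of exactly 85% to the higher range) are assumptions made for this example only.

```python
def kill_percent(cells_before, cells_after):
    """Percentage of test fungal cells killed, from viable counts."""
    if cells_before <= 0 or not 0 <= cells_after <= cells_before:
        raise ValueError("counts must satisfy 0 <= after <= before, before > 0")
    return 100.0 * (cells_before - cells_after) / cells_before

def embodiment_tier(percent):
    """Map a kill percentage to the range recited in the text.

    Boundary values are assigned to the higher range (an assumption).
    """
    if percent >= 100.0:
        return "most preferred (100%)"
    if percent >= 95.0:
        return "more preferred (95-99%)"
    if percent >= 85.0:
        return "preferred (85-95%)"
    if percent >= 75.0:
        return "75-85%"
    if percent >= 50.0:
        return "50-75%"
    return "below recited ranges"

# Hypothetical assay: 1000 viable cells before treatment, 100 after.
pct = kill_percent(1000, 100)
print(pct, embodiment_tier(pct))
```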

As will be appreciated by one skilled in the art, any target gene may be tested
for its requirement for normal activity of a pathway in order to develop a drug or therapeutic
directed to the pathway in which that target gene is involved. Further, it will be appreciated
that targets which are found in one species may also be targets in other species, and may be
validated by the methods of the invention.

5.6. EXPRESSION OF REPORTER GENES AND/OR TARGET GENES

The nucleotide sequence coding for a reporter gene or target gene of the
invention, or a functionally active analog or fragment or other derivative thereof, may be used,
for example, for the preparation of an assay in which to screen potential drugs which bind to,
or enhance, inhibit, or modulate the activity of such a protein, as described herein
below. In one embodiment, the sequence can be inserted into an appropriate expression
vector, i.e., a vector which contains the necessary elements for the transcription and
translation of the inserted protein-coding sequence. The necessary transcriptional and
translational signals can also be supplied by the native ergosterol-pathway gene and/or its
flanking regions. A variety of host-vector systems may be utilized to express the
protein-coding sequence. These include but are not limited to mammalian cell systems infected with
virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g.,
baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed
with bacteriophage DNA, plasmid DNA, or cosmid DNA. The expression elements of
vectors vary in their strengths and specificities. Depending on the host-vector system
utilized, any one of a number of suitable transcription and translation elements may be used.
In yet another embodiment, a fragment of a reporter or target protein comprising one or
more domains of the reporter or target protein is expressed.

In a specific embodiment, a vector is used that comprises a promoter operably
linked to a nucleic acid of a reporter gene or target gene, one or more origins of replication,
and, optionally, one or more selectable markers (e.g., an antibiotic resistance gene).

In other specific embodiments, the reporter or target protein, fragment,
analog, or derivative may be expressed as a fusion, or chimeric protein product (comprising
the protein, fragment, analog, or derivative joined via a peptide bond to a heterologous
protein sequence (of a different protein)). A chimeric protein may include fusion of the
reporter or target protein, fragment, analog, or derivative to a second protein or at least a
portion thereof, wherein a portion is one (preferably 10, 15, or 20) or more amino acids of
said second protein. Such a chimeric product can be made by ligating the appropriate
nucleic acid sequences encoding the desired amino acid sequences to each other by methods
known in the art, in the proper coding frame, and expressing the chimeric product by
methods commonly known in the art. Alternatively, such a chimeric product may be made
by protein synthetic techniques, e.g., by use of a peptide synthesizer.

The invention provides a method for identifying a molecule that modulates the
expression of an ergosterol-pathway gene selected from the group consisting of YHR039C
(as depicted in FIG. 2, as set forth in SEQ ID NO:1), YLW100W (as depicted in FIG. 4, as set
forth in SEQ ID NO:3), YPL272C (as depicted in FIG. 6, as set forth in SEQ ID NO:5),
YGR131W (as depicted in FIG. 8, as set forth in SEQ ID NO:7), and YDR453C (as depicted
in FIG. 10, as set forth in SEQ ID NO:9), comprising recombinantly expressing in a fungal
cell one or more candidate molecules, and detecting the expression of said
ergosterol-pathway gene; wherein an increase or decrease in the gene expression relative to
the expression in the absence of candidate molecules indicates that the molecule modulates
ergosterol-pathway gene expression.

The invention provides a method for identifying a molecule that modulates the
expression of a PKC-pathway gene selected from the group consisting of SLT2(YHR030C)
(as depicted in FIG. 17A-B, as set forth in SEQ ID NO:11), YKR161C (as depicted in
FIG. 19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in FIG. 21A-B, as
set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG. 23A-B, as set forth in
SEQ ID NO:17), YLR194C (as depicted in FIG. 25A-B, as set forth in SEQ ID NO:19), and
ST1(YDR055W) (as depicted in FIG. 27A-B, as set forth in SEQ ID NO:21), comprising
recombinantly expressing in a fungal cell one or more candidate molecules, and detecting
the expression of said PKC-pathway gene; wherein an increase or decrease in the gene
expression relative to the expression in the absence of candidate molecules indicates that the
molecule modulates PKC-pathway gene expression.

The invention provides a method for identifying a molecule that modulates the
expression of an Invasive Growth pathway gene selected from the group consisting of
KSS1(YGR040W) (as depicted in FIG. 29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG. 31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG. 33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG. 35, as set forth in SEQ ID NO:29), comprising recombinantly expressing in a fungal
cell one or more candidate molecules, and detecting the expression of said Invasive Growth
pathway gene; wherein an increase or decrease in the gene expression relative to the
expression in the absence of candidate molecules indicates that the molecule modulates
Invasive Growth pathway gene expression.

5.7. STRUCTURE OF REPORTER AND/OR
TARGET GENES AND PROTEINS 

The structure of reporter or target genes and proteins of the invention can be 
analyzed by various methods known in the art. Such analysis may be useful, for example, in 
the design of antifungal or fungicidal agents of the invention. Some examples of such 
methods are described below. 



5.7.1. GENETIC ANALYSIS 

The cloned DNA or cDNA corresponding to a reporter or target gene can be
analyzed by methods including but not limited to Southern hybridization (Southern, 1975, J.
Mol. Biol. 98:503-517), Northern hybridization (see, e.g., Freeman et al., 1983, Proc. Natl.
Acad. Sci. U.S.A. 80:4094-4098), restriction endonuclease mapping (Maniatis, 1982,
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, New York), and DNA sequence analysis. Accordingly, this invention
provides for the use of nucleic acid probes recognizing a reporter or target gene. For
example, polymerase chain reaction (PCR; U.S. Patent Nos. 4,683,202, 4,683,195 and
4,889,818; Gyllensten et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:7652-7656; Ochman et
al., 1988, Genetics 120:621-623; Loh et al., 1989, Science 243:217-220) followed by
Southern hybridization with a reporter or target gene-specific probe can allow the
detection of a reporter or target gene in DNA from various cell types. In one specific
embodiment, the cell types are from different species within the same phylogenetic
kingdom. Methods of amplification other than PCR are commonly known and can also be
employed. In one embodiment, Southern hybridization can be used to determine the genetic
linkage of a reporter or target gene. Northern hybridization analysis can be used to
determine the expression of a gene assigned to a particular biological pathway by the
methods disclosed herein. Various cell types, at various states of development or activity,
can be tested for gene expression. The stringency of the hybridization conditions for both
Southern and Northern hybridization can be manipulated to ensure detection of nucleic acids
with the desired degree of relatedness to the specific reporter or target gene probe used.
Modifications of these methods and other methods commonly known in the art can be used.

Restriction endonuclease mapping can be used to roughly determine the
genetic structure of a reporter or target gene. Restriction maps derived by restriction
endonuclease cleavage can be confirmed by DNA sequence analysis. Restriction
endonucleases may also be used to digest DNA sequences which are attached to
microarrays.

DNA sequence analysis can be performed by any techniques known in the
art, including but not limited to the method of Maxam and Gilbert (1980, Meth. Enzymol.
65:499-560), the Sanger dideoxy method (Sanger et al., 1977, Proc. Natl. Acad. Sci. U.S.A.
74:5463), the use of T7 DNA polymerase (Tabor and Richardson, U.S. Patent No.
4,795,699), or use of an automated DNA sequencer (e.g., Applied Biosystems, Foster City,
California). In a specific embodiment, DNA sequencing is used to confirm the sequence of
a microarray binding partner or probe.

5.7.2. PROTEIN ANALYSIS 
The amino acid sequence of an ergosterol-pathway protein can be derived by
deduction from the DNA sequence, or alternatively, by direct sequencing of the protein, e.g.,
with an automated amino acid sequencer. In a preferred embodiment, S. cerevisiae protein
sequences are obtained through the Saccharomyces Genome Database
(www.Stratford.edu/Saccharomyces).

A reporter-gene or target-gene protein sequence can be further characterized
by a hydrophilicity analysis (Hopp and Woods, 1981, Proc. Natl. Acad. Sci. U.S.A.
78:3824). A hydrophilicity profile can be used to identify the hydrophobic and hydrophilic
regions of the protein encoded by a reporter gene or target gene and the corresponding
regions of the gene sequence which encode such regions.
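
A hydrophilicity profile of this kind may be computed, for example, as a sliding-window average of per-residue hydrophilicity values. The sketch below is illustrative only: the function name `hydrophilicity_profile`, the window of six residues, and the example sequence are assumptions, and the per-residue values are given as the scale is commonly tabulated and should be verified against the cited reference before use.

```python
# Per-residue hydrophilicity values as commonly tabulated for the
# Hopp and Woods scale (verify against the cited reference before use).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}

def hydrophilicity_profile(sequence, window=6):
    """Return the sliding-window mean hydrophilicity along a protein.

    Positive values suggest hydrophilic regions; negative values suggest
    hydrophobic regions. The window size is an assumption of this sketch.
    """
    seq = sequence.upper()
    unknown = set(seq) - set(HOPP_WOODS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    scores = [HOPP_WOODS[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Hypothetical 12-residue peptide: a lysine-rich stretch followed by
# a leucine-rich stretch.
profile = hydrophilicity_profile("KKKKKKLLLLLL")
print(profile)
```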

Structural prediction analysis (Chou and Fasman, 1974, Biochemistry
13:222) can also be done, to identify regions of a protein encoded by a reporter gene or
target gene that assume specific secondary structures, which may be useful in the design of
therapeutics which target specific biological-pathway proteins.

Manipulation, translation, and secondary structure prediction, open reading
frame prediction and plotting, as well as determination of sequence homologies, can also be
accomplished using computer software programs available in the art.

Other methods of structural analysis can also be employed. These include but
are not limited to X-ray crystallography (Engstrom, 1974, Biochem. Exp. Biol. 11:7-13),
nuclear magnetic resonance spectroscopy (Clore and Gronenborn, 1989, CRC Crit. Rev.
Biochem. 24:479-564) and computer modeling (Fletterick and Zoller, 1986, Computer
Graphics and Molecular Modeling, in Current Communications in Molecular Biology, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York).

The invention further relates to the use of proteins encoded by reporter genes 
or target genes, derivatives (including but not limited to fragments), analogs, and molecules 
of reporter or target proteins. 

The production and use of fragments, derivatives, and analogs related to a
reporter or target protein are within the scope of the present invention. In a specific
embodiment, the derivative or analog is functionally active, i.e., capable of exhibiting one or
more functional activities associated with a full-length, wild-type reporter or target protein.
As one example, such derivatives or analogs which have the desired re-clustering activity
can be assigned to a biological-pathway. As yet another example, such derivatives or
analogs which have the desired co-clustering activity can be used as targets for the
development of drugs directed to such a target, such as an antifungal or fungicidal agent
directed to a target gene in the ergosterol-pathway. Derivatives or analogs that retain, or
alternatively lack or inhibit, a desired biological-pathway protein property-of-interest (e.g.,
binding to a specific biological pathway protein binding partner), can be used as inducers, or
inhibitors, respectively, of such property and its physiological correlates. A specific
embodiment relates to a dominant negative form of an ergosterol-pathway protein fragment
that can bind and inhibit ergosterol-pathway protein. Derivatives or analogs of an
ergosterol-pathway protein can be tested for the desired activity by procedures known in the
art, including but not limited to the assays described below.

In particular, reporter or target protein derivatives can be made by altering the
sequences by substitutions, additions (e.g., insertions) or deletions. Due to the degeneracy
of nucleotide coding sequences, other DNA sequences which encode substantially the same
amino acid sequence as the reporter or target gene may be used in the practice of the present
invention. These include but are not limited to nucleotide sequences comprising all or
portions of a reporter or target gene which is altered by the substitution of different codons
that encode a functionally equivalent amino acid residue within the sequence, thus producing
a silent change.
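
By way of illustration and not limitation, the degeneracy described above can be sketched in a few lines of Python. The codon table below is only the subset of the standard genetic code needed for the example, and the two short sequences are hypothetical and are not taken from any reporter or target gene of the invention.

```python
# Subset of the standard genetic code; only the codons used in this sketch.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "CTG": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",   # stop codons
}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

# Hypothetical sequences: every codon after ATG differs, the protein does not.
original = "ATGGCTCTGAAATAA"
variant = "ATGGCCTTAAAGTGA"

same_protein = translate(original) == translate(variant)
```

Here `original` and `variant` differ at the nucleotide level but encode the same amino acid sequence, which is the sense in which the change is silent.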

In a specific embodiment of the invention, use of proteins consisting of or 
comprising a fragment of reporter or target protein consisting of at least 10 (continuous)
amino acids of the reporter or target protein is provided. In other embodiments, the
fragment consists of at least 20 or at least 50 amino acids of the reporter or target protein. In
specific embodiments, such fragments are not larger than 35, 100 or 200 amino acids.
Derivatives or analogs of reporter or target proteins include but are not limited to those
molecules comprising regions that are substantially homologous to the reporter or target
protein or fragment thereof (e.g., in various embodiments, at least 60% or 70% or 80% or
90% or 95% identity over an amino acid sequence of identical size or when compared to an
aligned sequence in which the alignment is done by a computer homology program known
in the art) or whose encoding nucleic acid is capable of hybridizing to a coding reporter or
target gene sequence, under high stringency, moderate stringency, or low stringency
conditions.
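
As a minimal sketch of how the percent identity thresholds above might be evaluated, the following Python computes identity over two already-aligned sequences of equal length. The choice of denominator (all aligned columns, with gap columns counted as mismatches) is an assumption of this sketch, as are the example peptides; homology programs known in the art may define identity differently.

```python
THRESHOLDS = (60, 70, 80, 90, 95)   # percent identity levels named in the text

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

def thresholds_met(identity: float) -> list[int]:
    """Return every threshold that the given identity meets or exceeds."""
    return [t for t in THRESHOLDS if identity >= t]

# Hypothetical 10-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
met = thresholds_met(identity)
```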

Specifically, by way of example, computer programs for determining
homology may include but are not limited to TBLASTN, BLASTP, FASTA, TFASTA, and
CLUSTALW (Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-8;
Altschul et al., 1990, J. Mol. Biol. 215(3):403-10; Thompson, et al., 1994, Nucleic Acids
Res. 22(22):4673-80; Higgins, et al., 1996, Methods Enzymol. 266:383-402).

Specifically, Basic Local Alignment Search Tool (BLAST)
(www.ncbi.nlm.nih.gov) (Altschul et al., 1990, J. of Molec. Biol., 215:403-410, "The
BLAST Algorithm"; Altschul et al., 1997, Nuc. Acids Res. 25:3389-3402) is a heuristic
search algorithm tailored to searching for sequence similarity which ascribes significance
using the statistical methods of Karlin and Altschul (1990, Proc. Nat'l Acad. Sci. USA,
87:2264-68; 1993, Proc. Nat'l Acad. Sci. USA 90:5873-77). Five specific BLAST programs
perform the following tasks: 1) The BLASTP program compares an amino acid query
sequence against a protein sequence database; 2) The BLASTN program compares a
nucleotide query sequence against a nucleotide sequence database; 3) The BLASTX
program compares the six-frame conceptual translation products of a nucleotide query
sequence (both strands) against a protein sequence database; 4) The TBLASTN program
compares a protein query sequence against a nucleotide sequence database translated in all
six reading frames (both strands); 5) The TBLASTX program compares the six-frame
translations of a nucleotide query sequence against the six-frame translations of a nucleotide
sequence database.
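
The division of labor among the five programs can be summarized, purely for illustration, as a lookup keyed on the query type and the database type. The function name `choose_blast` and the `translated_search` flag are hypothetical and are not part of any BLAST distribution; the mapping itself restates the five tasks listed above.

```python
# (query type, database type) -> program, restating tasks 1) to 4) above.
BLAST_PROGRAMS = {
    ("protein", "protein"): "BLASTP",
    ("nucleotide", "nucleotide"): "BLASTN",
    ("nucleotide", "protein"): "BLASTX",
    ("protein", "nucleotide"): "TBLASTN",
}

def choose_blast(query_type: str, db_type: str, translated_search: bool = False) -> str:
    """Pick a program name; translated_search selects TBLASTX (task 5)."""
    if translated_search:
        # TBLASTX compares six-frame translations of both query and database,
        # so both must be nucleotide sequences.
        if (query_type, db_type) != ("nucleotide", "nucleotide"):
            raise ValueError("TBLASTX requires nucleotide query and database")
        return "TBLASTX"
    return BLAST_PROGRAMS[(query_type, db_type)]

program = choose_blast("protein", "nucleotide")
```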

Smith-Waterman (database: European Bioinformatics Institute
www2.ebi.ac.uk/bic_sw/) (Smith-Waterman, 1981, J. of Molec. Biol., 147:195-197) is a
mathematically rigorous algorithm for sequence alignments.

FASTA (see Pearson et al., 1988, Proc. Nat'l Acad. Sci. USA, 85:2444-2448)
is a heuristic approximation to the Smith-Waterman algorithm. For a general discussion of
the procedure and benefits of the BLAST, Smith-Waterman and FASTA algorithms see
Nicholas et al., 1998, "A Tutorial on Searching Sequence Databases and Sequence Scoring
Methods" (www.psc.edu) and references cited therein.
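
For readers unfamiliar with the Smith-Waterman approach, the following is a minimal sketch of its local-alignment scoring recurrence. The scoring values (match +2, mismatch -1, linear gap penalty -2) are arbitrary assumptions chosen for the example; practical implementations typically use substitution matrices and affine gap penalties, and also recover the alignment itself rather than only its score.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor of 0 lets an alignment restart anywhere (local alignment).
            h[i][j] = max(0, diagonal, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best

score = smith_waterman_score("XXACGTXX", "ACGT")
```

The exhaustive fill of the scoring matrix is what makes the method rigorous, and also what heuristic programs such as FASTA and BLAST approximate in order to search large databases more quickly.
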
The reporter or target derivatives and analogs of the invention can be
produced by various methods known in the art. The manipulations which result in their
production can occur at the gene or protein level. For example, a cloned reporter or target
gene sequence can be modified by any of numerous strategies known in the art (Sambrook et
al., 1989, Molecular Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, New York). The sequence can be cleaved at appropriate sites
with restriction endonuclease(s), followed by further enzymatic modification if desired,
isolated, and ligated in vitro.
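
As a simple illustration of locating cleavage sites, the sketch below digests a linear sequence in silico. It considers only the top strand, includes only two well-known enzymes (EcoRI, G^AATTC, and BamHI, G^GATCC), and uses a hypothetical input sequence; it is not a substitute for the laboratory procedures cited above.

```python
# Enzyme -> (recognition site, cut offset within the site on the top strand).
RESTRICTION_SITES = {
    "EcoRI": ("GAATTC", 1),   # G^AATTC
    "BamHI": ("GGATCC", 1),   # G^GATCC
}

def digest(seq: str, enzyme: str) -> list[str]:
    """Cut a linear sequence at every occurrence of the enzyme's site."""
    site, offset = RESTRICTION_SITES[enzyme]
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])   # remainder after the last cut
    return fragments

# Hypothetical sequence containing two EcoRI sites.
fragments = digest("AAGAATTCTTGAATTCGG", "EcoRI")
```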

Additionally, a reporter or target gene nucleic acid sequence can be mutated
in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination
sequences, or to create variations in coding regions and/or to form new restriction
endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification.
Any technique for mutagenesis known in the art can be used, including but not limited to,
chemical mutagenesis, in vitro site-directed mutagenesis (Hutchinson et al., 1978, J. Biol.
Chem. 253:6551), use of TAB® linkers (Pharmacia), PCR with primers containing a
mutation, etc.

Manipulations of a reporter or target protein sequence may also be made at
the protein level. Included within the scope of the invention are reporter or target protein
fragments or other derivatives or analogs which are differentially modified during or after
translation, e.g., by glycosylation, acetylation, phosphorylation, amidation, derivatization by
known protecting/blocking groups, proteolytic cleavage, linkage to an antibody molecule or
other cellular ligand, etc. Any of numerous chemical modifications may be carried out by
known techniques, including but not limited to specific chemical cleavage by cyanogen
bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH4, acetylation, formylation,
oxidation, reduction, metabolic synthesis in the presence of tunicamycin, etc.
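
By way of example only, two of the cleavage specificities named above can be modeled in silico. The sketch assumes the commonly stated rules that cyanogen bromide cleaves on the carboxyl side of methionine and that trypsin cleaves on the carboxyl side of lysine or arginine unless the next residue is proline; real digests are less clean, and the peptide shown is hypothetical.

```python
def cleave(protein: str, after: str, not_before: str = "") -> list[str]:
    """Split a protein after any residue in `after`, unless the following
    residue is in `not_before`. No cut is made after the final residue."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in after and following and following not in not_before:
            fragments.append(protein[start:i + 1])
            start = i + 1
    fragments.append(protein[start:])
    return fragments

peptide = "AMKRPGMK"                                         # hypothetical peptide
cnbr_fragments = cleave(peptide, after="M")                  # cyanogen bromide rule
trypsin_fragments = cleave(peptide, after="KR", not_before="P")
```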

In addition, analogs and derivatives of a reporter or target protein can be
chemically synthesized. For example, a peptide corresponding to a portion of a reporter or
target protein which comprises the desired domain, or which mediates the desired activity in
vitro, can be synthesized by use of a peptide synthesizer. Furthermore, if desired,
nonclassical amino acids or chemical amino acid analogs can be introduced as a substitution
or addition into the reporter or target sequence. Non-classical amino acids include but are
not limited to the D-isomers of the common amino acids, α-amino isobutyric acid,
4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib,
2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline,
hydroxyproline, sarcosine, citrulline, cysteic acid, t-butylglycine, t-butylalanine,
phenylglycine, cyclohexylalanine, β-alanine, fluoro-amino acids, designer amino acids such
as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino acid
analogs in general. Furthermore, the amino acid can be D (dextrorotary) or L (levorotary).

In a specific embodiment, a reporter or target protein derivative is a
chimeric or fusion protein comprising a reporter or target protein or fragment thereof
(preferably consisting of at least a domain or motif of the reporter or target protein, or at
least 10 amino acids of the reporter or target protein) joined at its amino- or
carboxy-terminus via a peptide bond to an amino acid sequence of a different protein. In
specific embodiments, the amino acid sequence of the different protein is at least 6, 10, 20 or
30 continuous amino acids of the different protein or a portion of the different protein that
is functionally active. In one embodiment, such a chimeric protein is produced by
recombinant expression of a nucleic acid encoding the protein (comprising a reporter or
target-coding sequence joined in-frame to a coding sequence for a different protein). Such a
chimeric product can be made by ligating the appropriate nucleic acid sequences encoding
the desired amino acid sequences to each other by methods known in the art, in the proper
coding frame, and expressing the chimeric product by methods commonly known in the art.
Alternatively, such a chimeric product may be made by protein synthetic techniques, e.g., by
use of a peptide synthesizer. Chimeric genes comprising portions of a reporter or target gene
fused to any heterologous protein-encoding sequences may be constructed. A specific
embodiment relates to a chimeric protein comprising a fragment of reporter or target protein
of at least six amino acids, or a fragment that displays one or more functional activities of
the reporter or target protein.
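
The requirement that the two coding sequences be joined in the proper coding frame can be illustrated with a short sketch. The function `fuse_in_frame` and the two toy open reading frames are hypothetical; the sketch only checks codon-length arithmetic and removes the upstream stop codon, and does not model linkers, restriction sites, or expression.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def fuse_in_frame(upstream_orf: str, downstream_orf: str) -> str:
    """Join two coding sequences so the downstream one stays in frame."""
    if len(upstream_orf) % 3 or len(downstream_orf) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    # Drop the upstream stop codon so translation reads through into the partner.
    if upstream_orf[-3:] in STOP_CODONS:
        upstream_orf = upstream_orf[:-3]
    return upstream_orf + downstream_orf

# Hypothetical toy ORFs: "ATG GCT TAA" fused to "ATG AAA TGA".
chimera = fuse_in_frame("ATGGCTTAA", "ATGAAATGA")
```
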
5.8. IDENTIFICATION OF COMPOUNDS
WITH BINDING CAPACITY

This invention provides screening methodologies useful in the identification 
of proteins and other compounds which bind to, or otherwise directly interact with, the 
reporter or target genes and proteins. Screening methodologies are well known in the art.
The proteins and compounds include endogenous cellular components which interact with 
the identified genes and proteins in vivo and which, therefore, may provide new targets for 
pharmaceutical and therapeutic interventions, as well as recombinant, synthetic, and 
otherwise exogenous compounds which may have binding capacity and, therefore, may be 
candidates for pharmaceutical agents. Thus, in one series of embodiments, cell lysates may 
be screened for proteins or other compounds which bind to one of the normal or mutant 
reporter or target genes and proteins. 

Alternatively, any of a variety of exogenous compounds, both naturally 
occurring and/or synthetic (e.g., libraries of small molecules or peptides), may be screened 
for binding capacity.

As will be apparent to one of ordinary skill in the art, there are numerous 
other methods of screening individual proteins or other compounds, as well as large libraries 
of proteins or other compounds (e.g., phage display libraries) to identify molecules which 
bind to reporter or target proteins of the invention. All of these methods comprise the step 
of mixing a reporter or target protein or fragment with test compounds, allowing time for 
any binding to occur, and assaying for any bound complexes. All such methods are enabled 
by the present disclosure of substantially pure reporter or target proteins, substantially pure 
functional domain fragments, fusion proteins, antibodies, and methods of making and using 
the same. In a specific embodiment, the reporter or target protein is an ergosterol-pathway 
protein. In another specific embodiment, the reporter or target protein is a PKC-pathway
protein. In another specific embodiment, the reporter or target protein is an Invasive 
Growth pathway protein. 

The invention provides a method of identifying a molecule that binds to a 
ligand selected from the group consisting of (i) an S. cerevisiae ergosterol-pathway 
protein selected from the group consisting of YHR039C (as depicted in FIG.3, as set 
forth in SEQ ID NO:2), YLW100W (as depicted in FIG.5, as set forth in SEQ ID 
NO:4), YPL272C (as depicted in FIG.7, as set forth in SEQ ID NO:6), YGR131W
(as depicted in FIG.9, as set forth in SEQ ID NO:8), and YDR453C (as depicted in
FIG.11, as set forth in SEQ ID NO:10), (ii) a fragment of the S. cerevisiae
ergosterol-pathway protein, and (iii) a nucleic acid encoding the S. cerevisiae
ergosterol-pathway protein or fragment, the method comprising: (a) contacting the
ligand with a plurality of molecules under conditions conducive to binding between
the ligand and the molecules; and (b) identifying a molecule within the plurality that
binds to the ligand.

The invention provides a method of identifying a molecule that binds to a ligand 
selected from the group consisting of (i) an S. cerevisiae PKC-pathway protein selected from 
the group consisting of SLT2(YHR030C) (as depicted in FIG.18, as set forth in SEQ ID
NO:12), YKR161C (as depicted in FIG.20, as set forth in SEQ ID NO:14),
PIR3(YKL163W) (as depicted in FIG.22, as set forth in SEQ ID NO:16), YPK2(YMR104C)
(as depicted in FIG.24, as set forth in SEQ ID NO:18), YLR194C (as depicted in FIG.26, as
set forth in SEQ ID NO:20), and ST1(YDR055W) (as depicted in FIG.28, as set forth in
SEQ ID NO:22), (ii) a fragment of the S. cerevisiae PKC-pathway protein, and (iii) a nucleic
acid encoding the S. cerevisiae PKC-pathway protein or fragment, the method
comprising: (a) contacting the ligand with a plurality of molecules under conditions
conducive to binding between the ligand and the molecules; and (b) identifying a molecule
within the plurality that binds to the ligand.

The invention provides a method of identifying a molecule that binds to a ligand
selected from the group consisting of (i) an S. cerevisiae Invasive Growth pathway protein
selected from the group consisting of KSS1(YGR040W) (as depicted in FIG.30, as set forth
in SEQ ID NO:24), PGU1(YJR153W) (as depicted in FIG.32, as set forth in SEQ ID
NO:26), YRL042C (as depicted in FIG.34, as set forth in SEQ ID NO:28), and
SVS1(YPL163C) (as depicted in FIG.36, as set forth in SEQ ID NO:30), (ii) a fragment of
the S. cerevisiae Invasive Growth pathway protein, and (iii) a nucleic acid encoding the S.
cerevisiae Invasive Growth pathway protein or fragment, the method comprising: (a)
contacting the ligand with a plurality of molecules under conditions conducive to binding
between the ligand and the molecules; and (b) identifying a molecule within the plurality
that binds to the ligand.

5.8.1. PROTEINS WHICH INTERACT WITH 
PATHWAY-SPECIFIC PROTEINS 

The present invention further provides methods of identifying or screening 
for proteins which interact with reporter or target proteins of a biological pathway of 
interest, or derivatives, fragments, or analogs thereof. In specific embodiments, the method
of identifying a molecule that binds to a ligand (e.g., an ergosterol-pathway protein)
comprises contacting the ligand with a plurality of molecules under conditions conducive to
binding between the ligand and the molecules; and identifying a molecule within the
plurality that binds to the ligand. The ligand or protein in the method can be in either purified
or non-purified form. Preferably, the method of identifying or screening is a yeast two-hybrid
assay system or a variation thereof, as further described below. In this regard, the
yeast two-hybrid method has been used to analyze protein-protein interactions (see e.g. Zhu
and Kahn, 1997, Proc. Natl. Acad. Sci. U.S.A. 94:13063-13068). Derivatives (e.g.,
fragments) and analogs of a protein can also be assayed for binding to a binding partner by
any method known in the art, for example, immunoprecipitation with an antibody that binds
to the protein in a complex followed by analysis by size fractionation of the
immunoprecipitated proteins (e.g., by denaturing or nondenaturing polyacrylamide gel
electrophoresis), Western analysis, non-denaturing gel electrophoresis, etc.

One aspect of the present invention provides methods for assaying and 
screening fragments, derivatives and analogs of reporter or target proteins of the invention 
for interacting proteins (e.g., for binding to an S. cerevisiae ergosterol peptide).
Derivatives, analogs and fragments of proteins that interact with a reporter or target protein
can preferably be identified by means of a yeast two-hybrid assay system (Fields and Song,
1989, Nature 340:245-246; U.S. Patent No. 5,283,173). Because the interactions are
screened for in yeast, the intermolecular protein interactions detected in this system occur
under physiological conditions that mimic the conditions in eukaryotic cells, including
vertebrates or invertebrates (Chien et al., 1991, Proc. Natl. Acad. Sci. U.S.A. 88:9578-9581).
By way of illustration, this feature facilitates identification of proteins capable of interaction
with an S. cerevisiae ergosterol-pathway protein from species other than S. cerevisiae.

Identification of interacting proteins by the improved yeast two-hybrid 
system is based upon the detection of expression of a "marker" gene, the transcription of
which is dependent upon the reconstitution of a transcriptional regulator by the interaction of
two proteins, each fused to one half of the transcriptional regulator. In some embodiments
of the invention, the "marker" genes, as described below, act as a read-out for the interaction
of two test proteins called the bait and the prey. The "bait" (i.e., a pathway-specific reporter
or target protein, or a derivative or analog thereof) and "prey" (proteins to be tested for
ability to interact with the bait) proteins are expressed as fusion proteins to a DNA binding
domain, and to a transcriptional regulatory domain, respectively, or vice versa. In various
specific embodiments, the prey has a complexity of at least about 50, about 100, about 500,
about 1,000, about 5,000, about 10,000, or about 50,000; or has a complexity in the range of
about 25 to about 100,000, about 100 to about 100,000, about 50,000 to about 100,000, or
about 100,000 to about 500,000. For example, the prey population can be one or more
nucleic acids encoding mutants of a protein (e.g., as generated by site-directed mutagenesis
or another method of making mutations in a nucleotide sequence). Preferably, the prey
populations are proteins encoded by DNA, e.g., cDNA or genomic DNA or
synthetically-generated DNA. For example, the populations can be expressed from chimeric genes
comprising cDNA sequences from an un-characterized sample of a population of cDNA
from mRNA.
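
The read-out logic of the two-hybrid system described above can be sketched as follows. This is a conceptual model only: the class `Fusion`, the protein names `BaitX`, `PreyA`, `PreyB` and `PreyC`, and the set of known interactions are all hypothetical, and the model treats marker expression as a simple yes/no outcome.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fusion:
    domain: str    # "DBD" (DNA binding domain) or "AD" (regulatory domain)
    protein: str   # the test protein fused to that domain

def marker_expressed(bait: Fusion, prey: Fusion, interacting_pairs: set) -> bool:
    """The marker is on only if both halves of the regulator are present
    and the two fused test proteins bind each other."""
    has_both_halves = {bait.domain, prey.domain} == {"DBD", "AD"}
    proteins_bind = frozenset((bait.protein, prey.protein)) in interacting_pairs
    return has_both_halves and proteins_bind

def screen_prey(bait: Fusion, prey_library: list, interacting_pairs: set) -> list:
    """Return the prey proteins whose fusion turns the marker on."""
    return [p.protein for p in prey_library
            if marker_expressed(bait, p, interacting_pairs)]

known = {frozenset(("BaitX", "PreyB"))}           # hypothetical interaction
bait = Fusion("DBD", "BaitX")
library = [Fusion("AD", name) for name in ("PreyA", "PreyB", "PreyC")]
hits = screen_prey(bait, library, known)
```

In this toy model the prey population has a complexity of three; the same filter applies unchanged to the much larger populations contemplated above.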

One characteristic of the yeast two-hybrid system is that proteins examined in 
this system are expressed as cytoplasmic proteins, and therefore do not pass through the 
secretory pathway. However, several methods are incorporated in the present invention to
examine derivatives of reporter or target proteins of the invention that mimic processed 
forms of these proteins. 

In a specific embodiment, recombinant biological libraries expressing random 
peptides can be used as the source of prey nucleic acids. 

In another embodiment, the invention provides methods of screening for
inhibitors or enhancers of the protein interactants identified herein. Briefly, the protein-protein
interaction assay can be carried out as described herein, except that it is done in the
presence of one or more candidate molecules. An increase or decrease in marker gene
activity relative to that present when the one or more candidate molecules are absent
indicates that the candidate molecule has an effect on the interacting pair. In a preferred
method, inhibition of the interaction is selected for (i.e., inhibition of the interaction is
necessary for the cells to survive), for example, where the interaction activates the URA3
gene, causing yeast to die in medium containing the chemical 5-fluoroorotic acid (Rothstein,
1983, Meth. Enzymol. 101:167-180). The identification of inhibitors of such interactions
can also be accomplished, for example, but not by way of limitation, using competitive
inhibitor assays, as described above.
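
The counter-selection described above inverts the usual read-out: cells survive only when the interaction is disrupted. A minimal sketch of that logic follows. The compound names and their effects are hypothetical, and the model ignores partial inhibition, compound toxicity, and other experimental variables.

```python
def ura3_active(interaction_intact: bool) -> bool:
    """In this embodiment the bait-prey interaction drives URA3 expression."""
    return interaction_intact

def survives_on_5foa(interaction_intact: bool) -> bool:
    """Cells expressing URA3 die on medium containing 5-fluoroorotic acid."""
    return not ura3_active(interaction_intact)

def select_inhibitors(disrupts_interaction: dict) -> list:
    """Keep candidates whose presence lets the cells survive on 5-FOA."""
    return [name for name, disrupts in disrupts_interaction.items()
            if survives_on_5foa(interaction_intact=not disrupts)]

# Hypothetical candidates mapped to whether they disrupt the interaction.
candidates = {"compound_1": False, "compound_2": True, "compound_3": False}
survivors = select_inhibitors(candidates)
```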

In general, proteins of the bait and prey populations are provided as fusion 
(chimeric) proteins (preferably by recombinant expression of a chimeric coding sequence) 
comprising each protein contiguous to a pre-selected sequence. For one population, the
pre-selected sequence is a DNA binding domain. The DNA binding domain can be any DNA
binding domain, as long as it specifically recognizes a DNA sequence within a promoter.
For example, the DNA binding domain is of a transcriptional activator or inhibitor. For the
other population, the pre-selected sequence is an activator or inhibitor domain of a
transcriptional activator or inhibitor, respectively. The regulatory domain alone (not as a
fusion to a protein sequence) and the DNA-binding domain alone (not as a fusion to a
protein sequence) preferably do not detectably interact (so as to avoid false positives in the 
assay). The assay system further includes a reporter gene operably linked to a promoter that 
contains a binding site for the DNA binding domain of the transcriptional activator (or 
inhibitor). 

Accordingly, in the present method of the invention, binding of a bait fusion
protein containing a reporter or target protein of the invention (such as an S. cerevisiae
ergosterol-pathway protein) to a prey fusion protein leads to reconstitution of a
transcriptional activator (or inhibitor) which activates (or inhibits) expression of the marker
gene. The activation (or inhibition) of transcription of the marker gene occurs
intracellularly, e.g., in prokaryotic or eukaryotic cells, preferably in cell culture.

The promoter that is operably linked to the marker gene nucleotide sequence
can be a native or non-native promoter of the nucleotide sequence, and the DNA binding
site(s) that are recognized by the DNA binding domain portion of the fusion protein can be
native to the promoter (if the promoter normally contains such binding site(s)) or non-native
to the promoter. Thus, for example, one or more tandem copies (e.g., four or five copies) of
the appropriate DNA binding site can be introduced upstream of the TATA box in the
desired promoter (e.g., in the area of about position -100 to about -400). In a preferred
aspect, 4 or 5 tandem copies of the 17 bp UAS (GAL4 DNA binding site) are introduced
upstream of the TATA box in the desired promoter, which is upstream of the desired coding
sequence for a selectable or detectable marker. In a preferred embodiment, the GAL1-10
promoter is operably fused to the desired nucleotide sequence; the GAL1-10 promoter
already contains 4 binding sites for GAL4.

Alternatively, the transcriptional activation binding site of the desired gene(s) 
can be deleted and replaced with GAL4 binding sites (Bartel et al., 1993, BioTechniques 
14:920-924; Chasman et al., 1989, Mol. Cell. Biol. 9:4746-4749). The marker gene 
preferably contains the sequence encoding a detectable or selectable marker, the expression
of which is regulated by the transcriptional activator, such that the marker is either turned on 
or off in the cell in response to the presence of a specific interaction. Preferably, the assay is 
carried out in the absence of background levels of the transcriptional activator (e.g., in a cell 
that is mutant or otherwise lacking in the transcriptional activator). 

In one embodiment, more than one marker gene is used to detect
transcriptional activation, e.g., one marker gene encoding a detectable marker and one or
more marker genes encoding different selectable markers. The detectable marker can be
any molecule that can give rise to a detectable signal, e.g., a fluorescent protein or a protein
that can be readily visualized or that is recognizable by a specific antibody. The selectable
marker can be any protein molecule that confers the ability to grow under conditions that do
not support the growth of cells not expressing the selectable marker, e.g., the selectable 
marker is an enzyme that provides an essential nutrient and the cell in which the interaction 
assay occurs is deficient in the enzyme and the selection medium lacks such nutrient. The 
marker gene can either be under the control of the native promoter that naturally contains a
binding site for the DNA binding protein, or under the control of a heterologous or synthetic
promoter.

The activation domain and DNA binding domain used in the assay can be
from a wide variety of transcriptional activator proteins, as long as these transcriptional
activators have separable binding and transcriptional activation domains. For example, the
GAL4 protein of S. cerevisiae (Ma et al., 1987, Cell 48:847-853), the GCN4 protein of S.
cerevisiae (Hope and Struhl, 1986, Cell 46:885-894), the ARD1 protein of S. cerevisiae
(Thukral et al., 1989, Mol. Cell. Biol. 9:2360-2369), and the human estrogen receptor
(Kumar et al., 1987, Cell 51:941-951), have separable DNA binding and activation domains.
The DNA binding domain and activation domain that are employed in the fusion proteins
need not be from the same transcriptional activator. In a specific embodiment, a GAL4 or
LEXA DNA binding domain is employed. In another specific embodiment, a GAL4 or
herpes simplex virus VP16 (Triezenberg et al., 1988, Genes Dev. 2:730-742) activation
domain is employed. In a specific embodiment, amino acids 1-147 of GAL4 (Ma et al.,
1987, Cell 48:847-853; Ptashne et al., 1990, Nature 346:329-331) comprise the DNA binding
domain, and amino acids 411-455 of VP16 (Triezenberg et al., 1988, Genes Dev. 2:730-742;
Cress et al., 1991, Science 251:87-90) comprise the activation domain.

In a preferred embodiment, the yeast transcription factor GAL4 is 
reconstituted by protein-protein interaction and the host strain is mutant for GAL4. In 
another embodiment, the DNA-binding domain is Ace1N and/or the activation domain is
Ace1, the DNA binding and activation domains of the Ace1 protein, respectively. Ace1 is a
yeast protein that activates transcription from the CUP1 operon in the presence of divalent
copper. CUP1 encodes metallothionein, which chelates copper, and the expression of CUP1
protein allows growth in the presence of copper, which is otherwise toxic to the host cells.
The marker gene can also be a CUP1-lacZ fusion that expresses the enzyme beta-galactosidase
(detectable by routine chromogenic assay) upon binding of a reconstituted
Ace1N transcriptional activator (see Chaudhuri et al., 1995, FEBS Letters 357:221-226). In
another specific embodiment, the DNA binding domain of the human estrogen receptor is
used, with a marker gene driven by one or three estrogen receptor response elements (Le
Douarin et al., 1995, Nucl. Acids. Res. 23:876-878).

The DNA binding domain and the transcriptional activator/inhibitor domain 
each preferably has a nuclear localization signal (see Ylikomi et al., 1992, EMBO J.
11:3681-3694; Dingwall and Laskey, 1991, TIBS 16:479-481) functional in the cell in 
which the fusion proteins are to be expressed. 

To facilitate isolation of the encoded proteins, the fusion constructs can 
further contain sequences encoding affinity tags such as glutathione-S-transferase or 
maltose-binding protein or an epitope of an available antibody, for affinity purification (e.g.,
binding to glutathione, maltose, or a particular antibody specific for the epitope,
respectively) (Allen et al., 1995, TIBS 20:511-516). In another embodiment, the fusion
constructs further comprise bacterial promoter sequences for recombinant production of the 
fusion protein in bacterial cells. 

The host cell in which the interaction assay occurs can be any cell, 
prokaryotic or eukaryotic, in which transcription of the marker gene can occur and be
detected, including, but not limited to, mammalian (e.g., monkey, mouse, rat, human,
bovine), chicken, bacterial, or insect cells, and is preferably a yeast cell. Expression
constructs encoding and capable of expressing the binding domain fusion proteins, the
transcriptional activation domain fusion proteins, and the marker gene product(s) are
provided within the host cell, by mating of cells containing the expression constructs, or by
cell fusion, transformation, electroporation, microinjection, etc. The host cell used should
not express an endogenous transcription factor that binds to the same DNA site as that
recognized by the DNA binding domain fusion population. Also, preferably, the host cell is
mutant or otherwise lacking in an endogenous, functional form of the marker gene(s) used in
the assay. Various vectors and host strains for expression of the two fusion protein
populations in yeast are known and can be used (see e.g., U.S. Patent No. 5,468,614; Bartel
et al., 1993, "Using the two-hybrid system to detect protein-protein interactions" In Cellular
Interactions in Development, Hartley, ed., Practical Approach Series xviii, IRL Press at
Oxford University Press, New York, NY, pp. 153-179; Fields and Sternglanz, 1994, Trends
in Genetics 10:286-292). By way of example but not limitation, yeast strains or derivative
strains made therefrom, which can be used are N105, N106, N1051, N1061, and YULH.
Other exemplary strains that can be used in the assay of the invention also include, but are
not limited to, the following:

Y190: MATa, ura3-52, his3-200, lys2-801, ade2-101, trp1-901, leu2-3,112,
gal4Δ, gal80Δ, cyhr2, LYS2::GAL1UAS-HIS3TATA-HIS3, URA3::GAL1UAS-GAL1TATA-lacZ; Harper
et al., 1993, Cell 75:805-816, available from Clontech, Palo Alto, California. Y190 contains
HIS3 and lacZ marker genes driven by GAL4 binding sites.

CG-1945: MATa, ura3-52, his3-200, lys2-801, ade2-101, trp1-901, leu2-3,112,
gal4-542, gal80-538, cyhr2, LYS2::GAL1UAS-HIS3TATA-HIS3,
URA3::GAL1UAS17mer(x3)-CYC1TATA-lacZ, available from Clontech, Palo Alto, California.
CG-1945 contains HIS3 and lacZ marker genes driven by GAL4 binding sites.

Y187: MAT-α, ura3-52, his3-200, ade2-101, trp1-901, leu2-3,112, gal4Δ,
gal80Δ, URA3::GAL1UAS-GAL1TATA-lacZ, available from Clontech, Palo Alto, California.
Y187 contains a lacZ marker gene driven by GAL4 binding sites.

SFY526: MATa, ura3-52, his3-200, lys2-801, ade2-101, trp1-901, leu2-3,112,
gal4-542, gal80-538, canr, URA3::GAL1-lacZ, available from Clontech, Palo Alto,
California. SFY526 contains HIS3 and lacZ marker genes driven by GAL4 binding sites.

HF7c: MATa, ura3-52, his3-200, lys2-801, ade2-101, trp1-901, leu2-3,112,
gal4-542, gal80-538, LYS2::GAL1-HIS3, URA3::GAL1UAS17mer(x3)-CYC1-lacZ, available
from Clontech, Palo Alto, California. HF7c contains HIS3 and lacZ marker genes driven by
GAL4 binding sites.

YRG-2: MATa, ura3-52, his3-200, lys2-801, ade2-101, trp1-901, leu2-3,112,
gal4-542, gal80-538, LYS2::GAL1UAS-GAL1TATA-HIS3, URA3::GAL1UAS17mer(x3)-CYC1-lacZ,
available from Stratagene, La Jolla, California. YRG-2 contains HIS3 and lacZ
marker genes driven by GAL4 binding sites. Many other strains commonly known and
available in the art can be used.
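
For convenience, the marker genes stated above for each exemplary strain can be collected into a small lookup, sketched here in Python. The table restates only what the strain descriptions above say about HIS3 and lacZ; the function name `strains_with` is hypothetical, and the full genotypes should be consulted before choosing a host.

```python
# Marker genes driven by GAL4 binding sites, as stated for each strain above.
STRAIN_MARKERS = {
    "Y190": {"HIS3", "lacZ"},
    "CG-1945": {"HIS3", "lacZ"},
    "Y187": {"lacZ"},
    "SFY526": {"HIS3", "lacZ"},
    "HF7c": {"HIS3", "lacZ"},
    "YRG-2": {"HIS3", "lacZ"},
}

def strains_with(required_markers: set) -> list:
    """Return, in sorted order, the strains carrying every required marker."""
    return sorted(name for name, markers in STRAIN_MARKERS.items()
                  if required_markers <= markers)

his3_strains = strains_with({"HIS3"})
lacz_strains = strains_with({"lacZ"})
```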

If not already lacking in endogenous marker gene activity, cells mutant in the 
marker gene may be selected by known methods, or the cells can be made mutant in the 
marker gene by known gene-disruption methods prior to introducing the marker gene
(Rothstein, 1983, Meth. Enzymol. 101:202-211).

In a specific embodiment, plasmids encoding the different fusion protein 
populations can be introduced simultaneously into a single host cell (e.g., a haploid yeast 
cell) containing one or more marker genes, by co-transformation, to conduct the assay for 
protein-protein interactions. Or, preferably, the two fusion protein populations are
introduced into a single cell either by mating (e.g., for yeast cells) or cell fusions (e.g., of
mammalian cells). In a mating type assay, conjugation of haploid yeast cells of opposite 
mating type that have been transformed with a binding domain fusion expression construct 
(preferably a plasmid) and an activation (or inhibitor) domain fusion expression construct 
(preferably a plasmid), respectively, will deliver both constructs into the same diploid cell.
The mating type of a yeast strain may be manipulated by transformation with the HO gene
(Herskowitz and Jensen, 1991, Meth. Enzymol. 194:132-146).

In a preferred embodiment, a yeast interaction mating assay is employed 
using two different types of host cells, strain-type a and alpha of the yeast Saccharomyces 
cerevisiae. The host cell preferably contains at least two marker genes, each with one or
more binding sites for the DNA-binding domain (e.g., of a transcriptional activator). The 
activator domain and DNA binding domain are each parts of chimeric proteins formed from 
the two respective populations of proteins. One strain of host cells, for example the a strain, 
contains fusions of the library of nucleotide sequences with the DNA-binding domain of a 
transcriptional activator, such as GAL4. The hybrid proteins expressed in this set of host
cells are capable of recognizing the DNA-binding site in the promoter or enhancer region in
the marker gene construct. The second set of yeast host cells, for example, the alpha strain,
contains nucleotide sequences encoding fusions of a library of DNA sequences fused to the 
activation domain of a transcriptional activator. 

In a preferred embodiment, the fusion protein constructs are introduced into 
the host cell as a set of plasmids. These plasmids are preferably capable of autonomous
replication in a host yeast cell and preferably can also be propagated in E. coli. The plasmid 
contains a promoter directing the transcription of the DNA binding or activation domain 
fusion genes, and a transcriptional termination signal. The plasmid also preferably contains 
a selectable marker gene, permitting selection of cells containing the plasmid. The plasmid 
can be single-copy or multi-copy. Single-copy yeast plasmids that have the yeast
centromere may also be used to express the activation and DNA binding domain fusions
(Elledge et al., 1988, Gene 70:303-312). 

In another embodiment, the fusion constructs are introduced directly into the 
yeast chromosome via homologous recombination. The homologous recombination for 
these purposes is mediated through yeast sequences that are not essential for vegetative
growth of yeast, e.g., the MER2, MER1, ZIP1, REC102, or MEI4 gene.

Bacteriophage vectors can also be used to express the DNA binding domain 
and/or activation domain fusion proteins. Libraries can generally be prepared faster and 
more easily from bacteriophage vectors than from plasmid vectors. 

In a specific embodiment, the present invention provides a method of
detecting one or more protein-protein interactions combined with a negative selection step as
described in PCT International Publication No. WO 97/47763, published December 18,
1997, which is incorporated by reference herein in its entirety. 

In a preferred embodiment, the bait S. cerevisiae ergosterol sequence and the 
prey library of chimeric genes are combined by mating the two yeast strains on solid media,
such that the resulting diploids contain both kinds of chimeric genes, i.e., the DNA-binding 
domain fusion and the activation domain fusion. 

Preferred marker genes include the URA3, HIS3 and/or the lacZ genes (see,
e.g., Rose and Botstein, 1983, Meth. Enzymol. 101:167-180) operably linked to GAL4 DNA-binding
domain recognition elements. Other marker genes include, but are not limited to,
Green Fluorescent Protein (GFP) (Cubitt et al., 1995, Trends Biochem. Sci. 20:448-455),
luciferase, LEU2, LYS2, ADE2, TRP1, CAN1, CYH2, GUS, CUP1 or chloramphenicol acetyl
transferase (CAT). Expression of the marker genes can be detected by techniques known in
the art (see, e.g., PCT International Publication No. WO 97/47763, published December 18,
1997, which is incorporated by reference herein in its entirety).

In a specific embodiment, transcription of the marker gene is detected by a
linked replication assay. For example, as described by Vasavada et al., 1991, Proc. Natl.
Acad. Sci. U.S.A. 88:10686-10690, expression of SV40 large T antigen is under the control
of the E1B promoter responsive to GAL4 binding sites. The replication of a plasmid
containing the SV40 origin of replication indicates a protein-protein interaction.
Alternatively, a polyoma virus replicon can be used (Vasavada et al., 1991, Proc. Natl.
Acad. Sci. U.S.A. 88:10686-90).

In another embodiment, the expression of marker genes that encode proteins 
can be detected by immunoassay, i.e., by detecting the immunospecific binding of an 
antibody to such protein, which antibody can be labeled, or incubated with a labeled binding
partner to the antibody, to yield a detectable signal. Alam and Cook disclose non-limiting 
examples of detectable marker genes that can be operably linked to a transcriptional 
regulatory region responsive to a reconstituted transcriptional activator, and thus used as 
marker genes (Alam and Cook, 1990, Anal. Biochem. 188:245-254). 

The activation of marker genes like URA3 or HIS3 enables the cells to grow
in the absence of uracil or histidine, respectively, and hence serves as a selectable marker.
Thus, after mating, the cells exhibiting protein-protein interactions are selected by the ability
to grow in media lacking a nutritional component, such as uracil or histidine (see Le Douarin
et al., 1995, Nucl. Acids Res. 23:876-878; Durfee et al., 1993, Genes Dev. 7:555-569;
Pierrat et al., 1992, Gene 119:237-245; Wolcott et al., 1966, Biochim. Biophys. Acta
122:532-534). In other embodiments of the present invention, the activities of the marker
genes like GFP or lacZ are monitored by measuring a detectable signal (e.g., fluorescent or
chromogenic, respectively) that results from the activation of these marker genes. LacZ
transcription, for example, can be monitored by incubation in the presence of a substrate,
such as X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside), of its encoded enzyme,
β-galactosidase. The pool of all interacting proteins isolated in this manner from mating the
S. cerevisiae ergosterol-pathway sequence product and the library identifies the "ergosterol- 
pathway interactive population". 

In a preferred embodiment of the present invention, false positives arising 
from transcriptional activation by the DNA binding domain fusion proteins in the absence of
a transcriptional activator domain fusion protein are prevented or reduced by negative
selection prior to exposure to the activation domain fusion population (see, e.g., PCT
International Publication No. WO 97/47763, published December 18, 1997, which is
incorporated by reference herein in its entirety). By way of example, if such cell contains
URA3 as a marker gene, negative selection is carried out by incubating the cell in the
presence of 5-fluoroorotic acid (5-FOA), which kills URA+ cells (Rothstein, 1983, Meth.
Enzymol. 101:167-180). Hence, the metabolism of 5-FOA will lead to cell death of
self-activating DNA-binding domain hybrids.
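
By way of illustration only, the logic of negative selection followed by positive selection can be sketched in Python as follows. This is a minimal model, not a laboratory protocol: the bait and prey names, the `self_activates` flag, and the `interactions` set are hypothetical and are not taken from this specification.

```python
def negative_selection(baits):
    """Mimic 5-FOA counter-selection: drop baits that activate URA3 on their own."""
    return [b for b in baits if not b["self_activates"]]


def positive_selection(baits, preys, interactions):
    """Return (bait, prey) pairs whose diploids would grow on selective medium.

    The marker gene is modeled as active when the bait self-activates or when
    the bait/prey pair is listed in `interactions`.
    """
    hits = []
    for bait in baits:
        for prey in preys:
            interacts = (bait["name"], prey) in interactions
            if bait["self_activates"] or interacts:
                hits.append((bait["name"], prey))
    return hits


# Hypothetical example data (not from this specification).
baits = [
    {"name": "baitA", "self_activates": False},
    {"name": "baitB", "self_activates": True},
]
preys = ["prey1", "prey2"]
interactions = {("baitA", "prey2")}

without_counter_selection = positive_selection(baits, preys, interactions)
with_counter_selection = positive_selection(
    negative_selection(baits), preys, interactions
)
```

In this toy model the self-activating bait yields a positive with every prey unless it is removed beforehand, which is the class of false positive the negative selection step is intended to reduce.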

In a preferred aspect, negative selection involving a selectable marker as a 
marker gene can be combined with the use of a toxic or growth inhibitory agent to allow a
higher rate of processing than other methods. Negative selection can also be carried out on
the activation domain fusion population prior to interaction with the DNA binding domain 
fusion population, by similar methods, either alone or in addition to negative selection of the 
DNA binding fusion population. Negative selection can be carried out on the recovered 
protein-protein complex by known methods (see, e.g., Bartel et al., 1993, BioTechniques
14:920-924; PCT International Publication No. WO 97/47763, published December 18,
1997).

In a preferred embodiment of the invention the DNA sequences encoding the 
pairs of interactive proteins are isolated by a method wherein either the DNA-binding 
domain hybrids or the activation domain hybrids are amplified, in separate respective
reactions. Preferably, the amplification is carried out by polymerase chain reaction (PCR)
(see U.S. Patent Nos. 4,683,202; 4,683,195; and 4,889,818; Gyllensten et al., 1988, Proc.
Natl. Acad. Sci. U.S.A. 85:7652-7656; Ochman et al., 1988, Genetics 120:621-623; Loh et
al., 1989, Science 243:217-220; Innis et al., 1990, PCR Protocols, Academic Press, Inc., San
Diego, California) using pairs of oligonucleotide primers specific for either the DNA-binding
domain hybrids or the activation domain hybrids. Other amplification methods
known in the art can be used, including but not limited to ligase chain reaction (see EP
320,308), use of Qβ replicase, or methods listed in Kricka et al., 1995, Molecular Probing,
Blotting, and Sequencing, Academic Press, New York, Chapter 1 and Table IX. 

The plasmids encoding the DNA-binding domain hybrid and the activation
domain hybrid proteins can also be isolated and cloned by any of the methods well known in
the art. For example, but not by way of limitation, if a shuttle (yeast to E. coli) vector is
used to express the fusion proteins, the genes can be recovered by transforming the yeast
DNA into E. coli and recovering the plasmids from E. coli (see, e.g., Hoffman et al., 1987,
Gene 57:267-272). Alternatively, the yeast vector can be isolated, and the insert encoding
the fusion protein subcloned into a bacterial expression vector, for growth of the plasmid in
E. coli. 



5.9. BIOCHEMICAL ASSAYS USING
REPORTER OR TARGET PROTEINS

The present invention provides for biochemical assays using the reporter or 
target proteins of the invention. In a specific embodiment, S. cerevisiae ergosterol-pathway 
proteins are useful for biochemical assays aimed at the identification and characterization of
S. cerevisiae substrates or binding partners or the identification of ligands for
ergosterol-pathway proteins that are yet to be assigned to the pathway. For any of the reporter or target
genes of the invention, the cDNAs encoding reporter or target proteins can be individually
subcloned into any of a large variety of eukaryotic expression vectors permitting expression
in fungal, yeast, plant, insect, worm, mammalian, or other cells, as described above. The
resulting genetically engineered cell lines expressing reporter or target proteins can be
assayed for production, processing, and degradation of the reporter or target proteins, for
example with antibodies to a specific reporter or target protein, such as to an S. cerevisiae
ergosterol-pathway protein, and Western blotting assays, or ELISA assays. For assays of
specific binding and functional activation of binding-partner proteins, one can employ either
crude culture medium or extracts containing secreted protein from genetically engineered
cells (devoid of other ergosterol-pathway proteins), or partially purified culture medium or
extracts, or preferably highly purified reporter or target protein fractionated, for example, by
chromatographic methods. Alternatively, a reporter or target protein can be synthesized
using chemical methods (Nagata, et al., 1992, Peptides 13(4):653-62).

Specific protein binding of a reporter or target protein to the reporter or
target binding partners or substrates can be assayed as follows, for example, following the
procedures of Yamaguchi et al. (Yamaguchi et al., 1995, Biochemistry 34:4962-4968).
Chinese hamster ovary cells, COS cells, or any other suitable cell line, can be transiently
transfected or stably transformed with expression constructs that direct the production of the
reporter or target protein binding-partner or substrate. Direct binding of a reporter or target
protein to such binding-partner or substrate-expressing cells can be measured using a
"labeled" purified reporter or target protein derivative, where the label is typically a
chemical or protein moiety covalently attached to the reporter or target polypeptide which
permits the experimental monitoring and quantitation of the labeled reporter or target protein
in a complex mixture.

Specifically, the label attached to the reporter or target protein can be a
radioactive substituent such as a 125I-moiety or 32P-phosphate moiety, a fluorescent
chemical moiety, or labels which allow for indirect methods of detection such as a
biotin-moiety for binding by avidin or streptavidin, an epitope-tag such as a Myc- or FLAG-tag, or
a protein fusion domain which allows for direct or indirect enzymatic detection such as an
alkaline phosphatase-fusion or Fc-fusion domain. Such labeled reporter or target proteins
can be used to test for direct and specific binding to binding-partner or substrate-expressing
cells by incubating the labeled reporter or target protein with the binding-partner or
substrate-expressing cells in serum-free medium, washing the cells with ice-cold phosphate
buffered saline to remove unbound reporter or target protein, lysing the cells in buffer with
an appropriate detergent, and measuring label in the lysates to determine the amount of
bound reporter or target protein. Alternatively, in place of whole cells, membrane fractions
or cell lysates obtained from binding-partner or substrate-expressing cells may also be used.
Also, instead of a direct binding assay, a competition binding assay may be used. For
example, crude extracts or purified reporter or target protein (such as an S. cerevisiae
ergosterol-pathway protein) can be used as a competitor for binding of labeled purified
reporter or target protein to binding-partner or substrate-expressing cells, by adding increasing
concentrations of reporter or target protein to the mixture. The specificity and affinity of
binding of the reporter or target protein can be judged by comparison with other reporter or
target proteins tested in the same assay.
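
By way of illustration only, data from a competition binding assay of this kind might be summarized by the competitor concentration that displaces half of the specifically bound label (IC50). The following Python sketch estimates that value by log-linear interpolation between measured points. The function name, the concentrations, and the bound-label values are hypothetical and are not taken from this specification; a full analysis would typically fit a binding model instead.

```python
import math


def estimate_ic50(concentrations, bound_signal):
    """Estimate IC50 by log-linear interpolation.

    concentrations: increasing competitor concentrations (> 0), any unit.
    bound_signal: labeled protein bound at each concentration; the first
        value is treated as maximal binding, the last as nonspecific binding.
    """
    top, bottom = bound_signal[0], bound_signal[-1]
    half = bottom + (top - bottom) / 2.0
    points = list(zip(concentrations, bound_signal))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= half >= s_hi and s_lo != s_hi:
            frac = (s_lo - half) / (s_lo - s_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("signal never crosses the half-maximal level")


# Hypothetical data: competitor concentration (nM) vs. bound label (cpm).
conc = [1, 10, 100, 1000, 10000]
cpm = [10000, 9000, 5500, 1500, 1000]
ic50 = estimate_ic50(conc, cpm)
```

Comparing such values for different reporter or target proteins tested in the same assay gives one simple, relative measure of binding affinity.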

5.9.1. IDENTIFICATION OF ADDITIONAL 
BINDING-PARTNERS

The invention described herein provides for methods in which reporter or 
target proteins are used for the identification of novel reporter or target protein binding- 
partners, using biochemical methods well known to those skilled in the art for detecting 
specific protein-protein interactions (Current Protocols in Protein Science, 1998, Coligan et 
al., eds., John Wiley & Sons, Inc., Somerset, New Jersey). In particular, it is possible that 
some reporter or target proteins interact with binding-partners that have not yet been 
discovered, or binding-partners that are specific to a particular organism (e.g., fungi). The 
identification of either novel binding-partners or specific binding-partners is of great interest 
with respect to human therapeutic applications, such as, for example, antifungal 
applications. By way of example, the novel cognate binding-partners for ergosterol-pathway 
proteins can be investigated and identified as follows. Labeled S. cerevisiae ergosterol-pathway
proteins can be used for binding assays in situ to identify cells possessing cognate
binding-partners, for example as described elsewhere (Gorczyca et al., 1993, J. Neurosci. 
13:3692-3704). Also, labeled S. cerevisiae ergosterol-pathway proteins can be used to 
identify specific binding proteins including binding-partner proteins by affinity 
chromatography of S. cerevisiae protein extracts using resins, beads, or chips with bound S. 
cerevisiae ergosterol-pathway protein (Formosa, et al., 1991, Methods Enzymol 208:24-45; 
Formosa, et al., 1983, Proc. Natl. Acad. Sci. USA 80(9):2442-6). Further, specific 
ergosterol-binding proteins can be identified by cross-linking of radioactively-labeled or 
epitope-tagged ergosterol-pathway protein to specific binding proteins in lysates, followed 
by electrophoresis to identify and isolate the cross-linked protein species (Ransone, 1995,
Methods Enzymol. 254:491-7). Still further, molecular cloning methods can be used to
identify novel binding-partners and binding proteins for S. cerevisiae ergosterol-pathway 
proteins including expression cloning of specific binding-partners using S. cerevisiae cDNA
expression libraries transfected into mammalian cells, expression cloning of specific binding 
proteins using S. cerevisiae cDNA libraries expressed in E. coli (Cheng and Flanagan, 1994, 
Cell 79(1):157-68), and yeast two-hybrid methods (as described above) using an S.
cerevisiae ergosterol-pathway protein fusion as a "bait" for screening activation-domain
fusion libraries derived from S. cerevisiae cDNA (Young and Davis, 1983, Science 222:778-82;
Young and Davis, 1983, Proc. Natl. Acad. Sci. USA 80(5):1194-8; Sikela and Hahn,
1987, Proc. Natl. Acad. Sci. USA 84(9):3038-42; Takemoto, et al., 1997, DNA Cell Biol.
16(6):797-9).

5.9.2. ASSAYS OF PATHWAY PROTEINS

The functional activity of reporter or target proteins, derivatives and analogs 
can be assayed by various methods known to one skilled in the art. 

For example, in one embodiment, where one is assaying for the ability to 
bind to or compete with a wild-type reporter or target protein for binding to an antibody
directed to the specific reporter or target protein, various immunoassays known in the art can
be used, including but not limited to competitive and non-competitive assay systems using
techniques such as radioimmunoassays, ELISA (enzyme linked immunosorbent assay),
"sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions,
immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or
radioisotope labels), western blots, precipitation reactions, agglutination assays (e.g., gel
agglutination assays, hemagglutination assays), complement fixation assays,
immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. In
one embodiment, antibody binding is detected by detecting a label on the primary antibody.
In another embodiment, the primary antibody is detected by detecting binding of a
secondary antibody or reagent to the primary antibody. In a further embodiment, the
secondary antibody is labeled. Many means are known in the art for detecting binding in an
immunoassay and are within the scope of the present invention. In another embodiment,
where a reporter or target protein is identified, the binding can be assayed, e.g., by means
well-known in the art. In another embodiment, physiological correlates of reporter or target
protein binding to its substrates and/or binding-partners (e.g., signal transduction) can be
assayed. 

In another embodiment, using insect (e.g., Sf9 cells), fly (e.g., D. 
melanogaster), or other model systems (such as other yeast or fungal systems, e.g., S. 
pombe), genetic studies can be done to study the phenotypic effect of a particular reporter or 
target gene mutant that is a derivative or analog of a wild-type reporter or target gene. Other
such methods will be readily apparent to the skilled artisan and are within the scope of the
invention. 

The invention provides a method for identifying a molecule that activates the
ergosterol pathway in yeast comprising contacting a yeast cell with one or more candidate
molecules, and detecting a change in the RNA expression of a reporter gene for the
ergosterol-pathway relative to the expression of the reporter gene in a yeast cell not
contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: YHR039C (as depicted in FIG.2, as set forth in SEQ ID
NO:1), YLR100W (as depicted in FIG.4, as set forth in SEQ ID NO:3), YPL272C (as
depicted in FIG.6, as set forth in SEQ ID NO:5), YGR131W (as depicted in FIG.8, as set
forth in SEQ ID NO:7), and YDR453C (as depicted in FIG.10, as set forth in SEQ ID
NO:9).

The invention provides a method for identifying a molecule that activates the
ergosterol pathway in yeast comprising contacting a yeast cell with one or more candidate
molecules, and detecting a change in the protein expression of a reporter gene for the
ergosterol-pathway relative to the expression of the reporter gene in a yeast cell not
contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: YHR039C (as depicted in FIG.2, as set forth in SEQ ID
NO:1), YLR100W (as depicted in FIG.4, as set forth in SEQ ID NO:3), YPL272C (as
depicted in FIG.6, as set forth in SEQ ID NO:5), YGR131W (as depicted in FIG.8, as set
forth in SEQ ID NO:7), and YDR453C (as depicted in FIG.10, as set forth in SEQ ID
NO:9).

The invention provides a method for identifying a molecule that activates the PKC
pathway in yeast comprising contacting a yeast cell with one or more candidate molecules,
and detecting a change in the RNA expression of a reporter gene for the PKC-pathway
relative to the expression of the reporter gene in a yeast cell not contacted by the one or more
candidate molecules, wherein the reporter gene is selected from the group consisting of:
SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID NO:11), YKR161C (as
depicted in FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in
FIG.21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG.23A-B,
as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set forth in SEQ ID
NO:19), and ST1(YDR055W) (as depicted in FIG.27A-B, as set forth in SEQ ID NO:21).

The invention provides a method for identifying a molecule that activates the PKC
pathway in yeast comprising contacting a yeast cell with one or more candidate molecules,
and detecting a change in the protein expression of a reporter gene for the PKC-pathway
relative to the expression of the reporter gene in a yeast cell not contacted by the one or more
candidate molecules, wherein the reporter gene is selected from the group consisting of:
SLT2(YHR030C) (as depicted in FIG.17A-B, as set forth in SEQ ID NO:11), YKR161C (as
depicted in FIG.19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in
FIG.21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG.23A-B,
as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG.25A-B, as set forth in SEQ ID
NO:19), and ST1(YDR055W) (as depicted in FIG.27A-B, as set forth in SEQ ID NO:21).

The invention provides a method for identifying a molecule that activates the 
Invasive Growth pathway in yeast comprising contacting a yeast cell with one or more 
candidate molecules, and detecting a change in the RNA expression of a reporter gene for 
the Invasive Growth pathway relative to the expression of the reporter gene in a yeast cell
not contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ
ID NO:23), PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25),
YRL042C (as depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as
depicted in FIG.35, as set forth in SEQ ID NO:29).

The invention provides a method for identifying a molecule that activates the 
Invasive Growth pathway in yeast comprising contacting a yeast cell with one or more 
candidate molecules, and detecting a change in the protein expression of a reporter gene for 
the Invasive Growth pathway relative to the expression of the reporter gene in a yeast cell 
not contacted by the one or more candidate molecules, wherein the reporter gene is selected
from the group consisting of: KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ 
ID NO:23), PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25), 
YRL042C (as depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as 
depicted in FIG.35, as set forth in SEQ ID NO:29). 
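
By way of illustration only, the comparison recited in the foregoing methods (reporter expression in a contacted cell relative to a cell not contacted) can be sketched in Python as a log2 ratio per reporter gene. The expression values and the two-fold cutoff (`min_abs_log2=1.0`) are assumptions chosen for the example and are not taken from this specification.

```python
import math


def log2_fold_changes(treated, untreated):
    """log2(treated / untreated) for every reporter present in `untreated`."""
    return {gene: math.log2(treated[gene] / untreated[gene]) for gene in untreated}


def changed_reporters(treated, untreated, min_abs_log2=1.0):
    """Names of reporters whose expression changed by at least the cutoff."""
    changes = log2_fold_changes(treated, untreated)
    return sorted(g for g, value in changes.items() if abs(value) >= min_abs_log2)


# Hypothetical expression levels (arbitrary units) for three of the
# ergosterol-pathway reporter genes named above.
untreated = {"YHR039C": 100.0, "YPL272C": 80.0, "YDR453C": 50.0}
treated = {"YHR039C": 450.0, "YPL272C": 20.0, "YDR453C": 55.0}

hits = changed_reporters(treated, untreated)
```

The same calculation applies whether the measured quantity is RNA or protein abundance; only the source of the input values differs.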

5.9.3. PROLIFERATION & CELL CYCLE ASSAYS

A reporter or target gene, such as those of the invention, may have potential
implications in the ability of a cell to proliferate. The present invention provides for cell
cycle and cell proliferation analysis by a variety of techniques known in the art, including
but not limited to the following:

Bromodeoxyuridine (BRDU) incorporation may be used as an assay to 
identify proliferating cells. The BRDU assay identifies a cell population undergoing DNA 
synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized 
DNA may then be detected using an anti-BRDU antibody (see Hoshino et al., 1986, Int. J.
Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79).

Cell proliferation may also be examined using [3H]-thymidine incorporation
(see, e.g., Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem.
270:18367-73). This assay allows for quantitative characterization of S-phase DNA
synthesis. In this assay, cells synthesizing DNA will incorporate [3H]-thymidine into newly
synthesized DNA. Incorporation can then be measured by standard techniques in the art
such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid
Scintillation Counter).

Cell proliferation may be measured by counting samples of a cell
population over time (e.g., daily cell counts). Cells may be counted using a hemacytometer
and light microscopy (e.g., HyLite hemacytometer, Hausser Scientific). Cell number may be
plotted against time in order to obtain a growth curve for the population of interest. In a
preferred embodiment, cells counted by this method are first mixed with the dye Trypan-blue
(Sigma), such that living cells exclude the dye, and are counted as viable members of
the population. Alternatively, cells in a liquid solution may be counted by absorbance
techniques known in the art.
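
By way of illustration only, a growth curve obtained from such counts can be reduced to a population doubling time. The following Python sketch assumes exponential growth over the sampled interval and fits the natural logarithm of cell number against time by least squares; the time points and counts are hypothetical and are not taken from this specification.

```python
import math


def doubling_time(times_h, counts):
    """Doubling time (hours) from a least-squares fit of ln(count) vs. time."""
    n = len(times_h)
    logs = [math.log(c) for c in counts]
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx  # specific growth rate, per hour
    return math.log(2) / slope


# Hypothetical viable-cell counts (cells/ml) taken every 2 hours.
times = [0, 2, 4, 6]
cells = [1e5, 2e5, 4e5, 8e5]
td = doubling_time(times, cells)
```

Counts taken during lag or stationary phase would violate the exponential-growth assumption and should be excluded before fitting.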

DNA content and/or mitotic index of the cells may be measured, for example,
based on the DNA ploidy value of the cell. For example, cells in the G1 phase of the cell
cycle generally contain a 2N DNA ploidy value. Cells in which DNA has been replicated
but have not progressed through mitosis (e.g., cells in S-phase) will exhibit a ploidy value higher
than 2N and up to 4N DNA content. Ploidy value and cell cycle kinetics may further be
measured using a propidium iodide assay (see, e.g., Turner, T., et al., 1998, Prostate 34:175-81).
In another embodiment, DNA content may be analyzed by preparation of a chromosomal
spread (Zabalou, S., 1994, Hereditas 120:127-40; Pardue, 1994, Meth. Cell Biol. 44:333-
351).
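
By way of illustration only, per-cell DNA content measurements (for example, propidium iodide fluorescence) can be binned into cell cycle phases relative to the 2N peak. In the following Python sketch the 10% tolerance, the "sub-G1" and ">4N" labels, and the fluorescence values are assumptions made for the example and are not taken from this specification.

```python
from collections import Counter


def classify_phase(dna_content, g1_peak, tolerance=0.1):
    """Assign a phase from DNA content relative to the 2N (G1) peak.

    dna_content and g1_peak are in the same arbitrary fluorescence units.
    """
    ratio = dna_content / g1_peak
    if ratio < 1 - tolerance:
        return "sub-G1"
    if ratio <= 1 + tolerance:
        return "G1"      # approximately 2N
    if ratio < 2 - tolerance:
        return "S"       # between 2N and 4N
    if ratio <= 2 + tolerance:
        return "G2/M"    # approximately 4N
    return ">4N"


# Hypothetical fluorescence readings with the G1 peak at 200 units.
readings = [205, 198, 300, 400, 100, 390]
phase_counts = Counter(classify_phase(r, g1_peak=200) for r in readings)
```

The resulting counts give the fraction of the population in each phase, which can then be compared between a test population and a wild-type population.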

Further assays include but are not limited to detection of changes in length of
the cell cycle or speed of cell cycle. In one embodiment the length of the cell cycle is
determined by the doubling time of a population of cells. In another embodiment, FACS
analysis is used to analyze the phase of cell cycle progression, or purify G1, S, and G2/M
fractions (see, e.g., Delia, D., et al., 1997, Oncogene 14:2137-47). In a further embodiment,
length or speed of the cell cycle of a test population is compared to wildtype populations.

Lapse of cell cycle checkpoint(s), and/or induction of cell cycle
checkpoint(s), may be examined by the methods described herein, or by any method known
in the art. Without limitation, a cell cycle checkpoint is a mechanism which ensures that
certain cellular events occur in a particular order. Checkpoint genes are defined by
mutations that allow late events to occur without prior completion of an early event
(Weinert, T., and Hartwell, L., 1993, Genetics, 134:63-80). Induction or inhibition of cell
cycle checkpoint genes may be assayed, for example, by Western blot analysis,
immunostaining, etc. Lapse of cell cycle checkpoints may be further assessed by
progression of a cell through the checkpoint without prior occurrence of specific events (e.g.,
progression into mitosis without complete replication of the genomic DNA).

Other methods will be apparent to one skilled in the art and are within the 
scope of the invention. 




5.9.4. OTHER FUNCTIONAL ASSAYS 

For functional assays of a reporter or target protein, beyond substrate binding,
the following activities can be investigated using cells expressing a reporter or target protein
of the invention after exposing said cells to crude or purified fractions of reporter or target
protein and comparing these results with those obtained with other reporter or target proteins
described above (Yamaguchi et al., 1995, Biochemistry 34:4962-4968). Assayable
functional activities include but are not limited to stimulation of cell proliferation; inhibition
of cell proliferation; cell death; cell membrane rupture; alterations in cell membrane
integrity; stimulation of overall tyrosine kinase activity by immunoblotting of cell extracts
with an anti-phosphotyrosine antibody; alteration of specific substrates in the biological
pathway with which the reporter or target is associated and immunoprecipitation with
antibodies that specifically recognize the substrate protein; and stimulation of other
enzymatic activities linked to the biological pathway.



5.10. ASSAYS FOR CHANGES IN GENE EXPRESSION 

This invention provides assays for detecting changes in the expression of the
reporter or target genes and proteins. Assays for changes in gene expression are well known
in the art (see, e.g., PCT Publication No. WO 96/34099, published October 31, 1996, which is
incorporated by reference herein in its entirety). Such assays may be performed in vitro
using transformed cell lines, immortalized cell lines, or recombinant cell lines, or in vivo
using animal models.

In particular, the assays may detect the presence of increased or decreased
expression of a reporter or target gene or protein on the basis of increased or decreased
mRNA expression (using, e.g., nucleic acid probes), increased or decreased levels of related
protein products (using, e.g., the antibodies disclosed herein), or increased or decreased
levels of expression of a marker gene (e.g., β-galactosidase or luciferase) operably linked to
a 5' regulatory region in a recombinant construct.

In yet another series of embodiments, various expression analysis techniques
may be used to identify genes which are differentially expressed between two conditions,
such as a cell line or animal expressing a normal reporter or target gene compared to another
cell line or animal expressing a mutant reporter or target gene. Such techniques comprise
any expression analysis technique known to one skilled in the art, including but not limited
to differential display, serial analysis of gene expression (SAGE), nucleic acid array
technology, subtractive hybridization, proteome analysis and mass-spectrometry of
two-dimensional protein gels. In a specific embodiment, nucleic acid array technology (e.g.,
microarrays) may be used to determine a global (i.e., genome-wide) gene expression pattern 
in a normal S. cerevisiae animal for comparison with an animal having a mutation in one or 
more S. cerevisiae reporter or target genes. 
To elaborate further, the various methods of gene expression profiling
mentioned above can be used to identify other genes (or proteins) that may have a functional
relation to (e.g., may participate in a signaling pathway with) a known gene. For example,
such other genes are identified by detecting changes in their expression
levels following mutation, i.e., insertion, deletion or substitution in, or overexpression,
underexpression, mis-expression or knock-out, of an S. cerevisiae ergosterol-pathway gene,
as described herein. Expression profiling methods thus provide a powerful approach for
analyzing the effects of mutation in an S. cerevisiae ergosterol-pathway gene, or any reporter
or target gene of the invention.

Methods of gene expression profiling are well-known in the art, as
exemplified by the following references describing subtractive hybridization (Wang and
Brown, 1991, Proc. Natl. Acad. Sci. U.S.A. 88:11505-11509), differential display (Liang and
Pardee, 1992, Science 257:967-971), SAGE (Velculescu et al., 1995, Science 270:484-487),
proteome analysis (Humphery-Smith et al., 1997, Electrophoresis 18:1217-1242; Dainese et
al., 1997, Electrophoresis 18:432-442), and hybridization-based methods employing nucleic
acid arrays (Heller et al., 1997, Proc. Natl. Acad. Sci. U.S.A. 94:2150-2155; Lashkari et al.,
1997, Proc. Natl. Acad. Sci. U.S.A. 94:13057-13062; Wodicka et al., 1997, Nature
Biotechnol. 15:1259-1267).

In a preferred specific embodiment of the invention, expression analysis
techniques are used to identify genes which are differentially expressed upon treatment of a
cell with a drug, or by other perturbations. In a further specific embodiment, genes which
are co-regulated (e.g., up-regulated upon treatment with a particular drug or antifungal
agent) are mapped to gene sets using deletion mutants (see, e.g., Section 6.2) and microarray
technology described herein. Still further, labeled cDNAs corresponding to a deletion
mutant from drug treated or untreated cells are hybridized to a single microarray.

5.11. REPORTER OR TARGET GENE REGULATORY ELEMENTS

This invention provides methods for using reporter or target gene regulatory
DNA elements to identify cells, genes, and factors that specifically control reporter or target
protein production. In one embodiment, regulatory DNA elements, such as
enhancers/promoters, from S. cerevisiae ergosterol-pathway genes are useful for identifying
and manipulating specific cells that synthesize an ergosterol-pathway protein. Such cells are
of considerable interest since they are likely to have an important regulatory function within
the fungus in controlling growth, development, reproduction, and/or metabolism. Analyzing
components that are specific to reporter- or target-secreting cells is likely to lead to an
understanding of how to manipulate these regulatory processes for therapeutic
applications, such as antifungal or fungicide applications, as well as an understanding of
how to diagnose dysfunction in these processes. For example, it is of specific interest to
investigate whether there are pathway genes in S. cerevisiae that might have a function
related to that of the mammalian cholesterol pathway in sensing and controlling metabolic
activity through the production of an ergosterol-pathway-like protein. Regulatory DNA
elements derived from reporter or target genes provide a means to mark and manipulate such
cells, and further, to identify regulatory genes and proteins, as described below.

5.11.1. PROTEIN-DNA BINDING ASSAYS 
In a third embodiment, reporter or target gene regulatory DNA elements are
also useful in protein-DNA binding assays to identify gene regulatory proteins that control
the expression of such reporter or target genes. Such gene regulatory proteins can be
detected using a variety of methods that probe specific protein-DNA interactions well known
to those skilled in the art (Kingston, 1998, In Current Protocols in Molecular Biology,
Ausubel et al., John Wiley & Sons, Inc., sections 12.0.3-12.10), including in vivo footprinting
assays based on protection of DNA sequences from chemical and enzymatic modification
within living or permeabilized cells, in vitro footprinting assays based on protection of DNA
sequences from chemical or enzymatic modification using protein extracts, nitrocellulose
filter-binding assays, and gel electrophoresis mobility shift assays using radioactively labeled
regulatory DNA elements mixed with protein extracts. In particular, it is of interest to
identify those DNA binding proteins whose presence or absence is specific to a reporter or
target protein as judged by comparison of the DNA-binding assays described above using
cells/extracts which express one or more reporter or target gene(s) versus other cells/extracts
that do not express the same reporter or target genes. For example, a DNA-binding activity
that is specifically present in cells that normally express an ergosterol-pathway protein might
function as a transcriptional activator of an ergosterol-pathway reporter or target gene;
conversely, a DNA-binding activity that is specifically absent in cells that normally express
an ergosterol-pathway reporter or target protein might function as a transcriptional repressor
of the ergosterol-pathway gene. Having identified candidate reporter or target gene
regulatory proteins using the above DNA-binding assays, these regulatory proteins can
themselves be purified using a combination of conventional and DNA-affinity purification
techniques. In this case, the DNA-affinity resins/beads are generated by covalent attachment
to the resin of a small synthetic double stranded oligonucleotide corresponding to the
recognition site of the DNA binding activity, or a small DNA fragment corresponding to the
recognition site of the DNA binding activity, or a DNA segment containing tandemly
iterated versions of the recognition site of the DNA binding activity. Alternatively,
molecular cloning strategies can be used to identify proteins that specifically bind
reporter or target gene regulatory DNA elements. For example, an S. cerevisiae cDNA library in an
E. coli expression vector, such as the lambda-gt11 vector, can be screened for S. cerevisiae
cDNAs that encode ergosterol-pathway gene regulatory element DNA-binding activity by
probing the library with a labeled DNA fragment, or synthetic oligonucleotide, derived from
the ergosterol-pathway gene regulatory DNA, preferably using a DNA region where specific
protein binding has already been demonstrated with a protein-DNA binding assay described
above (Singh et al., 1989, Biotechniques 7:252-61). Similarly, the yeast "one-hybrid"
system can be used as another molecular cloning strategy (Li and Herskowitz, 1993, Science
262:1870-4; Luo et al., 1996, Biotechniques 20(4):564-8; Vidal et al., 1996, Proc. Natl.
Acad. Sci. U.S.A. 93(19):10315-20). In this case, the ergosterol-pathway gene regulatory
DNA element, for example, is operably fused as an upstream activating sequence (UAS) to
one, or typically more, yeast marker genes such as the lacZ gene, the URA3 gene, the LEU2
gene, the HIS3 gene, or the LYS2 gene, and the marker gene fusion construct(s) are inserted into
an appropriate yeast host strain. It is expected that in the engineered yeast host strain the
reporter genes will not be transcriptionally active, for lack of a transcriptional activator
protein to bind the UAS derived from, for example, the S. cerevisiae ergosterol-pathway
gene regulatory DNA. The engineered yeast host strain can be transformed with a library of
S. cerevisiae cDNAs inserted in a yeast activation domain fusion protein expression vector,
e.g., pGAD, where the coding regions of the S. cerevisiae cDNA inserts are fused to a
functional yeast activation domain coding segment, such as those derived from the GAL4 or
VP16 activators. Transformed yeast cells that acquire S. cerevisiae cDNAs that encode
proteins that bind the gene regulatory element can be identified based on the concerted
activation of the marker genes, either by genetic selection for prototrophy (e.g., LEU2, HIS3, or
LYS2 reporters) or by screening with chromogenic substrates (lacZ reporter) by methods
known in the art.

6. EXAMPLES

The following examples are provided merely as illustrative of various aspects 
of the invention and shall not be construed to limit the invention in any way. 

6.1. CHARACTERIZATION OF S. CEREVISIAE 
ERGOSTEROL-PATHWAY GENES 

A group of S. cerevisiae genes have been discovered as novel reporters of the 
ergosterol-pathway in the model organism S. cerevisiae. This invention provides the 
following examples of characterization of five S. cerevisiae ergosterol-pathway reporter 
genes described in detail below. 

6.1.1. THE ERGOSTEROL PATHWAY 
Ergosterol is the primary membrane sterol in fungi and in some
trypanosomes. Ergosterol serves a structural role comparable to that of cholesterol in
mammalian cells, and is essential for the integrity and structure of the fungal cell membrane.
As depicted in FIG. 9, the ergosterol synthesis pathway contains at least 18 genes designated
ERG1 through ERG26. Several different classes of antifungal agents exist which target the
ergosterol-pathway.

6.1.2. CONSTRUCTION OF DELETION MUTANT 

Deletion mutants were constructed by standard techniques, essentially as
described by Rothstein, R., 1991, Meth. Enzymol. 194:281-301, which is incorporated herein
by reference in its entirety. Specifically, a deletion mutant of the entire coding region of
YER044C of S. cerevisiae was constructed in which the ORF YER044C was replaced by a
dominant selectable marker (the kanamycin resistance gene) from Escherichia coli
(Shoemaker, D. et al., 1996, Nature Gen. 14:450-56; Rothstein, R., 1991, Meth. Enzymol.
194:281-301; Baudin, A., et al., 1993, Nucl. Acids Res. 21:3329-30). This deletion mutant
(R711) has been deposited with Research Genetics (Huntsville, AL), Deletion
Consortium Strain #177. Briefly, the bacterial kanamycin resistance cassette (Wach, A. et
al., 1994, Yeast 10:1793-1808) was PCR amplified with primers that added homology to the
YER044C locus, to direct homologous integration of the dominant selectable marker. Cells
were then transformed with the PCR product. Cells were then selected for G418 resistance,
and the gene replacement was confirmed by PCR with the appropriate primers flanking the
YER044C locus.
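
By way of illustration only, the design of primers that add locus homology to a marker cassette may be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the 45-nucleotide homology length, the flanking sequences, and the cassette priming sequences are all hypothetical placeholders, and none of them is the sequence actually used for YER044C.

```python
# Sketch: build PCR primers that add locus homology to a marker cassette.
# Every sequence and the homology length below are made-up placeholders.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deletion_primers(upstream, downstream, cassette_fwd, cassette_rev,
                     homology_len=45):
    """Return (forward, reverse) primers, both written 5'->3'.

    upstream     -- top-strand sequence ending just before the ORF start
    downstream   -- top-strand sequence starting just after the ORF stop
    cassette_fwd -- sequence priming at the 5' end of the cassette
    cassette_rev -- sequence priming at the 3' end of the cassette
    """
    if len(upstream) < homology_len or len(downstream) < homology_len:
        raise ValueError("flanking sequence shorter than requested homology")
    # Forward primer: last bases before the ORF, then the cassette primer.
    forward = upstream[-homology_len:] + cassette_fwd
    # Reverse primer: reverse complement of the first bases after the ORF,
    # then the primer for the far end of the cassette.
    reverse = reverse_complement(downstream[:homology_len]) + cassette_rev
    return forward, reverse


UPSTREAM = "ATTGCA" * 10           # hypothetical 60-nt flank
DOWNSTREAM = "GGCTTA" * 10         # hypothetical 60-nt flank
CASSETTE_FWD = "ACGTACGTACGTACGTAC"  # hypothetical priming site
CASSETTE_REV = "TGCATGCATGCATGCATG"  # hypothetical priming site

fwd, rev = deletion_primers(UPSTREAM, DOWNSTREAM, CASSETTE_FWD, CASSETTE_REV)
print(len(fwd), len(rev))
```

The homology arms direct integration at the locus, while the 3' portion of each primer anneals to the cassette during amplification.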

The other gene deletions described in subsections below (e.g., BAR1, FUS3, DIG1,
and DIG2) were constructed using the same techniques as for YER044C.

6.1.3. GROWTH OF YEAST STRAINS AND DRUG TREATMENT

To assess the effects of pharmacologic inhibition of ergosterol biosynthesis,
wild-type S. cerevisiae strain R174 (also known as strain BY4741; Brachmann, C., et al.,
1998, Yeast 14(2):115-32) was grown to early log-phase in YPD rich medium at 30°C. The
culture was then split into 5 flasks and clotrimazole was added to the cultures at final
concentrations of 0.03, 0.1, 1.0, and 3.0 µg/ml. The cultures were then incubated at 30°C for
12 hours. Cells were then harvested and lysed, and poly A+ RNA was extracted, by methods known
in the art. Specifically, cells were harvested and lysed by standard methods (In Current
Protocols in Molecular Biology, Ausubel et al., John Wiley & Sons, Inc.) with the following
modifications: Cell pellets were resuspended in breaking buffer (0.2 M Tris HCl, pH 7.6 /
0.5 M NaCl / 10 mM EDTA / 1% SDS), mixed for 2 minutes on a multi-tube vortex mixer at
setting 8 in the presence of 60% (v/v) glass beads (425-600 µm mesh; Sigma, St. Louis,
MO) and phenol:chloroform (50:50 v/v). Following separation of the phases, the aqueous
phase, containing the total RNA, was reextracted and ethanol precipitated. Poly A+ RNA
was isolated by two sequential chromatographic purifications over oligo dT cellulose (New
England Biolabs Inc., Beverly, MA), as described in Current Protocols in Molecular
Biology, Ausubel et al., John Wiley & Sons, Inc.

To assess the effects on the ergosterol pathway of deleting the YER044C gene, yeast
strains R174 (wild type) and R711 (yer044c::kanR) were grown to early log phase in YPD
medium, and harvested for preparation of polyA mRNAs.

6.1.4. PREPARATION AND HYBRIDIZATION 
OF THE LABELED cDNA 

Fluorescently labeled cDNA was prepared by reverse transcription of poly A+
RNA in the presence of Cy3- (+ drug) or Cy5- (- drug) deoxynucleotide triphosphates.
Fluorescently labeled cDNAs were also purified, and hybridized essentially as described in
DeRisi, J., et al., 1997, Science 278:680-86, which is incorporated herein by reference in its
entirety. Briefly, Cy3- or Cy5-dUTP (Amersham) was incorporated into cDNA during
reverse transcription (Superscript II, Life Technologies, Inc., Gaithersburg, MD). Labeled
cDNAs were then concentrated to less than 10 µl using Microcon-30 microconcentrators
(Amicon, Millipore Corp., Bedford, MA). Labeled cDNAs from drug treated or untreated
cells were then resuspended in 20-26 µl hybridization solution (3X SSC, 0.75 µg/ml poly A
DNA, 0.2% SDS) and applied to the microarray (described below in section 6.1.5) under a
22x30 mm coverslip for 6 h. Both drug treated and untreated samples were simultaneously
hybridized to the microarray as described in U.S. Patent serial No. 179,569, filed October
27, 1998, now pending, U.S. Patent serial No. 09/220,275, filed December 23, 1998, now
pending, and U.S. Patent serial No. 09/220,142, filed December 23, 1998, now pending,

WO 00/58520 



PCT/US00/08555 



which are incorporated herein by reference in their entirety. Under these conditions, drug 
treatment resulted in a signature pattern of altered gene expression in which mRNA levels of 
about 500 ORFs changed by at least twofold. 
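
By way of illustration only, the selection of ORFs whose mRNA levels changed by at least twofold may be sketched as follows. This is a minimal sketch, assuming that normalized, positive intensities are already available for each ORF; the function name, the ORF names, and the intensity values are hypothetical.

```python
import math


def changed_orfs(treated, untreated, fold=2.0):
    """Return {orf: log10 ratio} for ORFs changed by at least `fold`.

    treated, untreated -- dicts mapping ORF name to a normalized,
                          strictly positive intensity
    """
    threshold = math.log10(fold)
    signature = {}
    # Only ORFs measured in both samples can be compared.
    for orf in treated.keys() & untreated.keys():
        log_ratio = math.log10(treated[orf] / untreated[orf])
        # Keep both induced (positive) and repressed (negative) ORFs.
        if abs(log_ratio) >= threshold:
            signature[orf] = log_ratio
    return signature


# Hypothetical intensities for three hypothetical ORFs.
treated = {"ORF_A": 400.0, "ORF_B": 105.0, "ORF_C": 20.0}
untreated = {"ORF_A": 100.0, "ORF_B": 100.0, "ORF_C": 80.0}

signature = changed_orfs(treated, untreated)
print(sorted(signature))
```

Working in log10 ratios treats a fourfold induction and a fourfold repression symmetrically, which is convenient when signatures are later compared.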

Alternatively, fluorescently-labeled cDNA was prepared, as above, by
reverse transcription of polyA+ RNA from the YER044C deletion mutant and hybridized to
the microarray. The signature of the deletion mutant was then compared to the signature of
the drug-treated cells, as described below.

6.1.5. FABRICATION OF MICROARRAYS

PCR products containing common 5' and 3' sequences were obtained from
Research Genetics (Huntsville, AL), and used as templates with amino-modified forward
primers and unmodified reverse primers to amplify 6065 ORFs from the yeast genome.
Amplification reactions that gave products of unexpected sizes were excluded from
subsequent analysis. ORFs that could not be amplified from purchased templates were
amplified from genomic DNA. DNA samples from 100 µl reactions were precipitated with
isopropanol, resuspended in water, brought up to a total volume of 15 µl in 3X SSC, and
transferred to 384-well microtiter plates (Genetix Ltd, Dorset, United Kingdom). PCR
products were robotically spotted onto 1 x 3 inch polylysine-coated glass slides. After
printing, slides were processed as described in DeRisi et al., supra. 100% of the total ORFs
of the yeast genome were amplified and attached to the microarray; thus a DNA microarray
consisting of more than 6000 oligonucleotides representing each of the known or predicted
ORFs in the yeast genome was prepared.

6.1.6. SCANNING AND IMAGING OF MICROARRAYS 
Microarrays to which labeled cDNAs had been hybridized were then imaged
on a prototype multi-frame charge-coupled device (CCD) camera (Applied Precision,
Seattle, WA). Each CCD image frame was approximately 2 mm square. Exposure times of 2
sec in the Cy5 channel (white light through a Chroma 618-648 nm excitation filter, Chroma
657-727 nm emission filter) and 1 sec in the Cy3 channel (Chroma 535-560 nm excitation
filter, Chroma 570-620 nm emission filter) were taken consecutively in each frame before
moving to the next, spatially contiguous frame. Color isolation between the Cy3 and Cy5
channels was 100:1 or better. Frames were knitted together in software to make the complete
images as in U.S. Patent serial No. 179,569, filed October 27, 1998, now pending, U.S.
Patent serial No. 09/220,275, filed December 23, 1998, now pending, and U.S. Patent serial
No. 09/220,142, filed December 23, 1998, now pending, which are incorporated herein by
reference in their entirety. The intensity of each spot was quantified from the 10 µm pixels
by frame-by-frame background subtraction and intensity averaging in each channel.
Normalization between the channels was accomplished by normalizing each channel to the
mean intensities of all genes.
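
By way of illustration only, background subtraction followed by normalization of each channel to its mean intensity may be sketched as follows. This is a simplified sketch: it assumes a single background value per channel rather than the frame-by-frame estimate described above, and the function name and all intensity values are hypothetical.

```python
from statistics import mean


def normalize_channels(cy3, cy5, cy3_background, cy5_background):
    """Background-subtract each channel, then scale it to its own mean.

    cy3, cy5 -- lists of raw spot intensities, one entry per gene,
                in the same gene order for both channels
    """
    def process(raw, background):
        # Clip at zero so that noise cannot produce negative intensities.
        corrected = [max(value - background, 0.0) for value in raw]
        channel_mean = mean(corrected)
        if channel_mean == 0:
            raise ValueError("channel has no signal above background")
        # After scaling, the mean intensity of the channel is 1.
        return [value / channel_mean for value in corrected]

    return process(cy3, cy3_background), process(cy5, cy5_background)


# Hypothetical raw intensities for three spots; Cy5 is uniformly brighter.
cy3_raw = [110.0, 210.0, 310.0]
cy5_raw = [220.0, 420.0, 620.0]

cy3_norm, cy5_norm = normalize_channels(cy3_raw, cy5_raw, 10.0, 20.0)
ratios = [r / g for r, g in zip(cy5_norm, cy3_norm)]
print(cy3_norm, cy5_norm, ratios)
```

In this example the uniform brightness difference between the two channels is removed, so every spot has an expression ratio of one after normalization.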

6.1.7. ASSIGNMENT OF YEAST ORFS TO THE ERGOSTEROL
PATHWAY USING DNA MICROARRAY

The ORFs which are the subject of the present invention were discovered to 
be within the ergosterol pathway using DNA microarray technology (U.S. Patent serial No. 
179,569, filed October 27, 1998 now pending, U.S. Patent serial No. 09/220,275 filed 
December 23, 1998, now pending, and U.S. Patent serial No. 09/220,142, filed December 
23, 1998 now pending, which are incorporated herein by reference in their entirety). 

Clotrimazole treatment of yeast resulted in the upregulation of approximately
500 genes, many of which were induced by a wide variety of different types of
perturbations of yeast. To determine which of these genes were specifically associated with the
ergosterol-pathway, the clotrimazole transcriptional signatures were compared with many
other drug treatment and mutant signatures.

The similarity of signatures was quantified using the correlation coefficient. 
Correlation coefficients between the signature ORFs of various experiments were calculated 
according to Equation 4 in section 5.1 above, i.e., by the equation:

$$\rho_{ij} = \frac{\sum_n v_i^{(n)} v_j^{(n)}}{\left[ \sum_n \left( v_i^{(n)} \right)^2 \, \sum_n \left( v_j^{(n)} \right)^2 \right]^{1/2}} \qquad (10)$$

where $v_i^{(n)}$ and $v_j^{(n)}$ are the $\log_{10}$ of the expression ratio for the genes i and j, respectively, in
response to perturbation n. The summation was over those genes that were either up- or
down-regulated in either experiment at the 95% confidence level. These genes each had less
than a 5% chance of being actually unregulated, that is, having expression ratios departing
from unity due to measurement errors alone. This confidence level was assigned based on an
error model which assigns a log normal probability distribution to each gene's expression
ratio with characteristic width based on the observed scatter in its repeated measurements
and on the individual array hybridization quality. This latter dependence was derived from
control experiments in which both Cy3 and Cy5 samples were derived from the same RNA
sample. As negative controls, deletion mutants known to affect pathways unrelated to
ergosterol biosynthesis were analyzed. However, the mutant deleted in YER044C, which
had not previously been assigned any function in the yeast genome, also gave a signature
that correlated positively with the signature of drug-treated cells.
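
By way of illustration only, the comparison of two signatures by a correlation coefficient summed over significantly regulated genes may be sketched as follows. This is a minimal sketch under stated assumptions: the error model is reduced to a per-gene standard error on the log10 ratio with a two-sided cutoff of 1.96 standard errors standing in for the 95% confidence level, whereas the error model described above also depends on array hybridization quality. The function names, gene names, and values are hypothetical.

```python
import math


def is_regulated(log_ratio, std_error, z_cutoff=1.96):
    """True if the log10 ratio departs from zero beyond the cutoff."""
    return abs(log_ratio) > z_cutoff * std_error


def signature_correlation(sig_a, sig_b, err_a, err_b):
    """Uncentered correlation between two expression signatures.

    sig_a, sig_b -- dicts mapping gene to log10 expression ratio
    err_a, err_b -- dicts mapping gene to the standard error of that ratio
    The sums run over genes regulated in either experiment.
    """
    genes = [g for g in sig_a.keys() & sig_b.keys()
             if is_regulated(sig_a[g], err_a[g])
             or is_regulated(sig_b[g], err_b[g])]
    numerator = sum(sig_a[g] * sig_b[g] for g in genes)
    denominator = math.sqrt(sum(sig_a[g] ** 2 for g in genes)
                            * sum(sig_b[g] ** 2 for g in genes))
    return numerator / denominator if denominator else 0.0


# Hypothetical signatures: the mutant mirrors the drug at half the amplitude.
drug = {"GENE_1": 0.6, "GENE_2": -0.5, "GENE_3": 0.02}
mutant = {"GENE_1": 0.3, "GENE_2": -0.25, "GENE_3": -0.01}
errors = {"GENE_1": 0.1, "GENE_2": 0.1, "GENE_3": 0.1}

rho = signature_correlation(drug, mutant, errors, errors)
print(round(rho, 6))
```

Because the coefficient depends on the pattern of the log ratios and not on their overall amplitude, a weaker perturbation of the same pathway can still correlate strongly with a stronger one.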

Using this analysis, two genes designated YHR039C and YLR100W were
discovered to cluster on the same branch (as seen in FIG. 14) and were associated with the
ergosterol pathway. These genes have been assigned as reporters of the ergosterol pathway.
Three other genes have also been discovered to co-cluster on a second branch (as seen in
FIG. 14) and have been discovered to be associated with the ergosterol pathway. These
three genes, YPL272C, YGR131W, and YDR453C, were found to tightly cluster and have
therefore been discovered to be associated with the ergosterol-pathway and act as novel
reporters for the ergosterol pathway.
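
By way of illustration only, the grouping of genes that behave similarly across experiments into co-clustering branches may be sketched as follows. This is a generic average-linkage agglomerative sketch and is not the clustering method of the patent applications cited above; the function names, the similarity threshold of 0.9, the gene names, and the profiles are hypothetical.

```python
import math


def correlation(x, y):
    """Uncentered correlation of two equal-length log-ratio profiles."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator if denominator else 0.0


def cluster_genes(profiles, min_similarity=0.9):
    """Greedy average-linkage clustering of genes by profile similarity.

    profiles -- dict mapping gene to a list of log10 ratios, one per
                experiment, in the same experiment order for every gene
    """
    clusters = [[gene] for gene in sorted(profiles)]

    def similarity(c1, c2):
        # Average pairwise correlation between members of two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(correlation(profiles[a], profiles[b])
                   for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        best, i, j = max((similarity(clusters[i], clusters[j]), i, j)
                         for i in range(len(clusters))
                         for j in range(i + 1, len(clusters)))
        if best < min_similarity:
            break  # no remaining pair is similar enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical profiles across four hypothetical experiments.
profiles = {
    "GENE_A": [0.5, 0.4, 0.0, 0.0],
    "GENE_B": [0.6, 0.5, 0.0, 0.1],
    "GENE_C": [0.0, 0.1, 0.7, 0.6],
    "GENE_D": [0.0, 0.0, 0.5, 0.5],
}
branches = cluster_genes(profiles)
print(branches)
```

Genes that end up in the same branch respond to the same subset of perturbations, which is the property used to nominate them as candidate reporters of a common pathway.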

Taken together, these data indicated that five S. cerevisiae genes, designated
YLR100W, YHR039C, YGL001C, YPL272C, YGR131W, and YDR453C were involved in
the ergosterol biosynthesis pathway and were novel reporters for the pathway. One or a
combination of these genes may also serve as targets for antifungal drug development.

6.2. CHARACTERIZATION OF S. CEREVISIAE 
PKC-PATHWAY GENES 

A group of S. cerevisiae genes have been discovered as novel reporters and/or
targets of the PKC-pathway in the model organism S. cerevisiae. This invention provides
the following examples of characterization of six S. cerevisiae PKC-pathway reporter genes
described in detail below. Two of these S. cerevisiae PKC-pathway reporter genes have
been further validated as target genes and are described in detail below.



6.2.1. THE PKC PATHWAY 

Protein kinase C (PKC) is a highly conserved protein throughout all
eukaryotes. In the yeast S. cerevisiae, PKC regulates the mitogen-activated protein (MAP)
kinase cascade, which is
required for maintenance of cell integrity during periods of asymmetric or polarized growth.
FIG. 15 shows a diagram of the PKC pathway in yeast, and demonstrates the reporters and
target genes in the PKC pathway that have been discovered by the methods of the invention.

PKC plays a role in regulating the formation of a mating projection. The
mating signal is transmitted to PKC through the activities of another Rho-GTPase, CDC42,
and of BNI1 and RHO1.

6.2.2. NOVEL PKC REPORTER AND TARGET GENES 
In order to illustrate the methods of the invention, DNA microarray analysis
was used to find reporter and target genes of the PKC pathway. The transcriptional activity
of yeast genes across a diverse number of experimental treatments of yeast, including a large
number of drug treatments and mutations, as well as many experiments involving activation
of the yeast mating process, was used in the clustering analysis methods of the invention.
Perturbation of the cells for PKC experiments was performed by constructing constitutively
activated alleles of PKC (PKC1-R398A) or RHO1 (RHO1-Q68H). Expression of these
alleles was placed under the control of the inducible GAL1/10 promoter, and served as the
perturbation. Cells containing constitutively activated alleles of PKC or RHO1 were
compared to control cells lacking such activated alleles.

The yeast strains used to find reporters of the PKC pathway are as follows:

R4084 = MATa bar1::kanR trp1-63 his3-200 leu2-0 met15-0 ura3-0 pRS316 (CEN URA3)

R4081 = MATa bar1::kanR trp1-63 his3-200 leu2-0 met15-0 ura3-0 pGAL-RHO1 (GAL1p-RHO1-Q68H, CEN, URA3)

R4075 = MATa bar1::kanR leu2-0 his3-1 ura3-0 trp1-63 pGAL-PKC (GAL1p-PKC1-R398A, 2 micron, URA3)

R4081 contained the plasmid pGAL-RHO1, with the RHO1-Q68H gene controlled by the
GAL1 promoter, on a low copy CEN, URA3-based plasmid. R4084 was a similar strain,
but contained only the plasmid pRS316, which is similar to pGAL-RHO1 except it lacks the
RHO1-Q68H gene. R4075 was also similar to R4081, except it contained the plasmid
pGAL-PKC, with the PKC1-R398A gene on a high copy 2 micron, URA3-based plasmid.

For PKC experiments, R4084 and R4075 or R4084 and R4081 were grown as
pairs of cultures that were treated identically. The strains were grown as overnight cultures
at 30°C in SC-ura (synthetic complete medium minus uracil; yeast nitrogen base, ammonium
sulfate, and the complete set of amino acid supplements except uracil) with raffinose as the
carbon source. The cells were then subcultured at a low density in fresh medium for 2
hours, then galactose was directly added to the medium at a final concentration of 2%, and
incubation continued for 3 hours. The cells were then harvested and total RNAs were
prepared as labeled cDNAs for hybridization to microarrays. Pairs of hybridizations were
done for each comparison, with the Cy3 and Cy5 fluors reversed for each pair to eliminate
color biases due to differential fluor incorporation, as described above. The competitive
hybridization pairs were as follows:
pGAL-PKC1-R398A:

1. Cy3 = R4084 (pRS316) vs Cy5 = R4075 (pGAL-PKC1-R398A)

2. Cy3 = R4075 (pGAL-PKC1-R398A) vs Cy5 = R4084 (pRS316)

pGAL-RHO1-Q68H:

1. Cy3 = R4084 (pRS316) vs Cy5 = R4081 (pGAL-RHO1-Q68H)

2. Cy3 = R4081 (pGAL-RHO1-Q68H) vs Cy5 = R4084 (pRS316)
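
By way of illustration only, the way in which a fluor-reversed pair of hybridizations cancels a color bias may be sketched as follows. This is a minimal sketch, assuming that the bias acts as a gene-specific multiplicative factor on one fluor and that the two log ratios are simply averaged; the function name, gene name, and intensities are hypothetical.

```python
import math


def dye_swap_log_ratio(forward, reverse):
    """Combine a fluor-reversed pair into one log10(experiment/control).

    forward -- gene -> (cy3_control, cy5_experiment) intensities
    reverse -- gene -> (cy3_experiment, cy5_control) intensities
    """
    combined = {}
    for gene in forward.keys() & reverse.keys():
        cy3_control, cy5_experiment = forward[gene]
        cy3_experiment, cy5_control = reverse[gene]
        # Both ratios are oriented experiment over control.
        forward_ratio = math.log10(cy5_experiment / cy3_control)
        reverse_ratio = math.log10(cy3_experiment / cy5_control)
        # A multiplicative Cy5 bias adds to one ratio and subtracts from
        # the other, so it cancels in the average.
        combined[gene] = (forward_ratio + reverse_ratio) / 2.0
    return combined


# Hypothetical gene with a true 4-fold induction; Cy5 reads 2x too bright.
forward = {"GENE_X": (100.0, 800.0)}   # control in Cy3, experiment in Cy5
reverse = {"GENE_X": (400.0, 200.0)}   # experiment in Cy3, control in Cy5

result = dye_swap_log_ratio(forward, reverse)
print(10 ** result["GENE_X"])
```

Taken alone, the forward hybridization would report an eightfold change and the reversed hybridization a twofold change; the combined value recovers the fourfold change.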

Perturbation of cells by PKC activated alleles resulted in a large
transcriptional response and co-clustered genesets. Comparison of the activated allele
experiments to other experiments in the database (e.g., controls) using 2D clustering as
described in U.S. Patent serial No. 09/220,275, filed December 23, 1998, now pending, and
U.S. Patent serial No. 09/220,142, filed December 23, 1998, now pending, revealed novel
reporter genes whose expression is activated only under conditions of PKC activation.
These genes, which included PIR3, YPK2, YLR194C, YDR055W, SLT2 and YKL161C, were
discovered to be novel reporters of the PKC pathway. These four genes may serve as novel
targets for inhibiting or modulating activation of the PKC pathway. Further, two of the
genes, SLT2 and YKL161C, were found to be located in the PKC pathway, and have
therefore been discovered to serve as target genes of the PKC pathway.

Such novel PKC pathway-specific reporters have a wide variety of uses,
including, for example, use in high throughput, cell based assays for general compounds
that activate PKC. Target genes have a wide variety of uses, such as providing a target against
which a drug designed to activate, inhibit or modify the PKC pathway may be designed and tested.
Such target genes may also serve as the substrate or binding partner for a drug or compound
which is tested for activity in activating, inhibiting or modifying the PKC pathway, or cellular
responses and phenotypes associated with the PKC pathway, including, for example, cell
wall integrity.

6.3. CHARACTERIZATION OF S. CEREVISIAE
INVASIVE GROWTH PATHWAY GENES

A group of S. cerevisiae genes have been discovered as novel reporters and/or 
targets of the Invasive Growth pathway in the model organism S. cerevisiae. This invention 
provides the following examples of characterization of four S. cerevisiae Invasive Growth 
pathway reporter genes described in detail below. Two of these S. cerevisiae pathway 
reporter genes have been further validated as target genes. 

6.3.1. THE INVASIVE GROWTH PATHWAY 

The yeast S. cerevisiae is dimorphic in that it can proliferate either by
budding or by forming multicellular filaments called pseudohyphae, which can invade the
agar (Madhani and Fink, 1998, Trends Cell Biol. 8(9):348-53). Diploid cells
undergo the Invasive Growth pathway in response to nitrogen starvation, whereas haploid
cells undergo the Invasive Growth pathway and form invasive filaments on rich medium.
The mitogen-activated protein (MAP) kinase cascade is diagrammed in FIG. 15.

6.3.2. NOVEL INVASIVE GROWTH
REPORTER AND TARGET GENES

DNA microarray analysis of the genome of normal and mutant yeast strains
was combined with two dimensional (2D) clustering analysis of the behaviors of 6000 genes
across many perturbations. Using cluster analysis, a group of genes were identified to be
induced transcriptionally in response to perturbations of the Invasive Growth pathway. Genes
which were induced specifically by perturbations of the Invasive Growth pathway were
therefore discovered to be reporters for the Invasive Growth pathway. These genes included
the PGU1, YLR042C, SVS1, and KSS1 genes.
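
By way of illustration only, the selection of genes that are induced specifically by perturbations of one pathway may be sketched as follows. This is a simplified filter rather than the clustering analysis itself, and it assumes a twofold induction threshold; the function name, the gene names, the experiment labels, and the values are hypothetical.

```python
import math


def pathway_specific_reporters(log_ratios, pathway_experiments, fold=2.0):
    """Return genes induced in every pathway experiment and in no other.

    log_ratios          -- gene -> {experiment name: log10 ratio}
    pathway_experiments -- set of experiment names that perturb the pathway
    """
    threshold = math.log10(fold)
    reporters = []
    for gene, by_experiment in sorted(log_ratios.items()):
        inside = [value for name, value in by_experiment.items()
                  if name in pathway_experiments]
        outside = [value for name, value in by_experiment.items()
                   if name not in pathway_experiments]
        induced_inside = bool(inside) and all(v >= threshold for v in inside)
        quiet_outside = all(v < threshold for v in outside)
        if induced_inside and quiet_outside:
            reporters.append(gene)
    return reporters


# Hypothetical data: two pathway perturbations and one unrelated treatment.
pathway = {"pathway_mutant_1", "pathway_mutant_2"}
data = {
    "GENE_X": {"pathway_mutant_1": 0.7, "pathway_mutant_2": 0.6,
               "unrelated_drug": 0.05},
    "GENE_Y": {"pathway_mutant_1": 0.8, "pathway_mutant_2": 0.5,
               "unrelated_drug": 0.9},   # induced everywhere: not specific
    "GENE_Z": {"pathway_mutant_1": 0.1, "pathway_mutant_2": 0.0,
               "unrelated_drug": 0.0},   # never induced
}
reporters = pathway_specific_reporters(data, pathway)
print(reporters)
```

The breadth of the unrelated experiments determines how meaningful the specificity call is, which is why a large and diverse set of perturbations is valuable for this kind of analysis.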

In order to search for reporter genes of the Invasive Growth pathway, yeast
strains with particular mutations (e.g., perturbations) were used as follows. The fus3 strain
R500 (MATa bar1::kanR ura3-0 leu2-0 his3-1 met15-0 fus3::URA3) or the dig1 dig2 strain
R4063 (MATa bar1::kanR ura3-0 leu2-0 his3-1 met15-0 dig1::LEU2 dig2::URA3), or the
isogenic wild type parent, R276 (MATa bar1::kanR ura3-0 leu2-0 his3-1 met15-0), were
grown as overnight cultures by standard methods in the art. Each culture was then diluted
and grown to log phase. Alpha factor treatment was performed by adding 50 nM alpha
factor directly to the cultures and incubating for 30 minutes. The cells were then harvested,
total RNA was prepared by standard methods in the art, and polyA mRNAs were selected on
oligo-dT cellulose. Next, fluorescently labeled cDNAs were prepared for DNA microarray
experiments as described above. The following hybridizations were performed:

1. Strain R276 (wild type) vs. R500 (fus3), no alpha factor.

2. Strain R276 (wild type) + 50 nM alpha factor, 30 minutes, vs. strain R500
(fus3) + 50 nM alpha factor, 30 minutes.

3. R276 vs. R4063 (dig1 dig2), neither with alpha factor.

The results of the hybridization experiments were examined by correlating
the signatures to the signatures from a wide variety of other experiments, and by cluster
analysis of gene behaviors across all these experiments. Four genes were found to be
induced specifically in experiments in which the Invasive Growth pathway was activated,
including KSS1, PGU1, YLR042C, and SVS1. Surprisingly, the MAPK KSS1 gene serves
as a specific reporter and target for experiments in which KSS1 is active.


These target genes are useful for screening for compounds that block
invasive growth in S. cerevisiae. Because many aspects of the invasive growth pathway are
conserved between S. cerevisiae and pathogenic fungi, such as Candida albicans, and
the switch to filamentous growth is essential for C. albicans virulence, such drugs will
serve as novel antifungal agents.

The KSS1 gene will serve as a useful reporter for activation of the invasive 
growth pathway, since it has been discovered that induction of this gene is highly specific 
for this pathway. The use of combinations of two or more of the four invasive growth 
reporter genes will serve to greatly increase the sensitivity of such a reporter assay. 

Each of the other genes has been discovered to be induced by other
cellular perturbations. Specifically, PGU1 and YLR042C were found to be induced by
treatment (e.g., perturbation) with the peptide pheromone, alpha factor. SVS1 was found to
be repressed by alpha factor perturbation. Mutants deleted for DIG1 and DIG2, in the
absence of alpha factor, also showed increased transcription of the four genes PGU1,
YLR042C, SVS1, and KSS1. Mutants deleted for the FUS3 MAPK also showed several-fold
upregulation of the PGU1, YLR042C, SVS1, and KSS1 genes. Additionally, each of the
PGU1, YLR042C, SVS1, and KSS1 genes was induced by activation of KSS1.

Such target genes may also serve as a substrate or binding partner for a drug
or compound which is tested for activity in activating, inhibiting or modifying the Invasive
Growth pathway, or cellular responses and phenotypes associated with the Invasive Growth
pathway, including, for example, invasion of fungus or pathogenicity of fungus.

6.4. NOVEL REPORTER AND TARGET GENES 

A group of S. cerevisiae genes has been discovered by the methods of the 
invention as novel reporters and/or targets of pathways in the model organism S. 
cerevisiae. Table I, below, lists these genes and their associated pathways, as well as the 
corresponding figures and SEQ ID NOs. 



TABLE I

Gene Name         Pathway           FIG.     SEQ ID NO.   Sequence Type
YHR039C           Ergosterol        2        1            DNA
                                    3        2            Protein
YLR100W           Ergosterol        4        3            DNA
                                    5        4            Protein
YPL272C           Ergosterol        6        5            DNA
                                    7        6            Protein
YGR131W           Ergosterol        8        7            DNA
                                    9        8            Protein
YDR453C           Ergosterol        10       9            DNA
                                    11       10           Protein
SLT2(YHR030C)     PKC               17A-B    11           DNA
                                    18       12           Protein
YKL161C           PKC               19A-B    13           DNA
                                    20       14           Protein
PIR3(YKL163W)     PKC               21A-B    15           DNA
                                    22       16           Protein
YPK2(YMR104C)     PKC               23A-B    17           DNA
                                    24       18           Protein
YLR194C           PKC               25A-B    19           DNA
                                    26       20           Protein
PST1(YDR055W)     PKC               27A-B    21           DNA
                                    28       22           Protein
KSS1(YGR040W)     Invasive Growth   29       23           DNA
                                    30       24           Protein
PGU1(YJR153W)     Invasive Growth   31       25           DNA
                                    32       26           Protein
YLR042C           Invasive Growth   33       27           DNA
                                    34       28           Protein
SVS1(YPL163C)     Invasive Growth   35       29           DNA
                                    36       30           Protein

The present invention is not to be limited in scope by the specific 
embodiments described herein. Indeed, various modifications of the invention in addition to 
those described herein will become apparent to those skilled in the art from the foregoing 
description and accompanying drawings. Such modifications are intended to fall within the 
scope of the appended claims. 

Various references are cited herein above, including patent applications, 
patents, and publications, the disclosures of which are hereby incorporated by reference in 
their entireties. 



WHAT IS CLAIMED IS: 

1. A method of identifying a reporter gene for a particular biological 
pathway in a cell comprising identifying a gene which clusters to a geneset associated with 
the biological pathway, wherein said gene which clusters to the geneset associated with the 
particular biological pathway is a reporter gene. 

2. The method of claim 1, wherein a geneset associated with the 
particular biological pathway is identified by a method comprising identifying one or more 
genes in a geneset which are associated with the particular biological pathway, wherein said 
geneset having one or more genes associated with the particular biological pathway is a 
geneset associated with the particular biological pathway. 

3. The method of claim 1, wherein a geneset associated with the 
particular biological pathway is identified by identifying a geneset which is activated or 
inhibited by perturbations which target the biological pathway, wherein a geneset which is 
activated or inhibited by perturbations which target the biological pathway is a geneset 
associated with the particular biological pathway. 

4. The method of claim 1, further comprising identifying a gene which 
clusters specifically to a geneset associated with the particular biological pathway, wherein 
said gene which clusters specifically to the geneset associated with the particular biological 
pathway is a reporter gene. 

5. The method of claim 4, wherein the reporter gene is further identified 
as a gene whose expression is not altered by perturbations which affect other biological 
pathways, said other biological pathways being different from said particular biological 
pathway. 



6. The method of claim 1, wherein the geneset is provided by a method 
comprising: 

(a) measuring changes in expression of a plurality of genes in the cell in 
response to a plurality of perturbations to the cell; and 

(b) grouping or re-ordering said plurality of genes into one or more 
co-varying sets, 

wherein said one or more co-varying sets comprise said geneset. 

7. The method of claim 6, wherein said plurality of genes are grouped or 
re-ordered into one or more co-varying sets by means of a pattern recognition algorithm. 

8. The method of claim 7, wherein the pattern recognition algorithm is a 
clustering algorithm. 

9. The method of claim 8, wherein the clustering algorithm analyzes 
arrays or matrices, said arrays or matrices representing said measured changes in expression 
of the plurality of genes in the cell in response to the plurality of perturbations to the cell, 
wherein said analysis determines dissimilarities between individual genes. 

10. The method of claim 6, wherein said plurality of perturbations to the 
cell are also grouped or re-ordered according to their similarity. 

11. The method of claim 10, wherein said plurality of perturbations to the 
cell are grouped or re-ordered by means of a pattern recognition algorithm. 

12. The method of claim 11, wherein the pattern recognition algorithm is 
a clustering algorithm. 

13. The method of claim 12, wherein the clustering algorithm analyzes 
arrays or matrices, said arrays or matrices representing said measured changes in expression 
of the plurality of genes in the cell in response to the plurality of perturbations to the cell. 

14. The method of claim 1, wherein the reporter gene is further identified 
as having a high level of induction. 

15. The method of claim 14, wherein expression of the reporter gene is 
further identified to change by at least a factor of two in response to perturbations of the 
particular biological pathway. 

16. The method of claim 15, wherein expression of the reporter gene is 
further identified to change by at least a factor of 10 in response to perturbations to the 
particular biological pathway. 

17. The method of claim 16, wherein expression of the reporter gene is 
further identified to change by at least a factor of 100 in response to perturbations to the 
particular biological pathway. 

18. The method of claim 1, wherein expression of the reporter gene is 
further identified to change in response to slight perturbations to the particular biological 
pathway. 

19. The method of claim 18, wherein the perturbation to the particular 
biological pathway comprises exposure to a drug, and said reporter gene is further identified 
to change in response to low levels of exposure to the drug. 

20. The method of claim 1, wherein the reporter gene is further identified 
to respond to perturbations targeted to the entire particular biological pathway. 

21. The method of claim 1, wherein the reporter gene is further identified 
to respond to perturbations targeted to one or more portions of the particular biological 
pathway. 

22. The method of claim 21, wherein the reporter gene is further 
identified to respond to perturbations targeted to early steps of the particular biological 
pathway. 

23. The method of claim 21, wherein the reporter gene is further 
identified to respond to perturbations targeted to late steps of the particular biological 
pathway. 

24. The method of claim 1, wherein the reporter gene is further identified 
by identifying a gene which is induced quickly in response to perturbations to the 
particular biological pathway. 

25. The method of claim 24, wherein the reporter gene is further 
identified by identifying a gene which reaches steady state within about eight hours after a 
perturbation to the particular biological pathway. 

26. The method of claim 24, wherein the reporter gene is further 
identified by identifying a gene which reaches steady state within about six hours after a 
perturbation to the particular biological pathway. 

27. The method of claim 24, wherein the reporter gene is further 
identified by identifying a gene which is induced within about two hours after a perturbation 
to the particular biological pathway. 

28. The method of claim 27, wherein the reporter gene is further 
identified by identifying a gene which is induced within about 90 minutes after a 
perturbation to the particular biological pathway. 

29. The method of claim 28, wherein the reporter gene is further 
identified by identifying a gene which is induced within about 60 minutes after a 
perturbation to the particular biological pathway. 

30. The method of claim 29, wherein the reporter gene is further 
identified by identifying a gene which is induced within about 30 minutes after a 
perturbation to the particular biological pathway. 

31. The method of claim 30, wherein the reporter gene is further 
identified by identifying a gene which is induced within about 10 minutes after a 
perturbation to the particular biological pathway. 

32. The method of claim 31, wherein the reporter gene is further 
identified by identifying a gene which is induced within about 7 minutes after a perturbation 
to the particular biological pathway. 



33. A method of identifying a target gene for a particular biological 
pathway in a cell comprising identifying a gene which clusters to a geneset associated with 
the particular biological pathway, wherein said gene which clusters to a geneset associated 
with the particular biological pathway and is identified as a gene which is necessary for 
normal function of said particular biological pathway is a target gene. 

34. The method of claim 33, wherein a geneset associated with the 
particular biological pathway is identified by a method comprising identifying one or more 
genes in a geneset which are associated with the particular biological pathway, wherein said 
geneset having one or more genes associated with the particular biological pathway is a 
geneset associated with the particular biological pathway. 

35. The method of claim 33, wherein a geneset associated with the 
particular biological pathway is identified by identifying a geneset which is activated or 
inhibited by perturbations which target the biological pathway, wherein a geneset which is 
activated or inhibited by perturbations which target the biological pathway is a geneset 
associated with the particular biological pathway. 

36. The method of claim 33, wherein genesets are provided by a method 
comprising: 

(a) measuring changes in expression of a plurality of genes in the cell in 
response to a plurality of perturbations to the cell; and 

(b) grouping or re-ordering said plurality of genes into one or more 
co-varying sets, 

wherein said one or more co-varying sets comprise said genesets. 

37. The method of claim 36, wherein said plurality of genes are grouped 
or re-ordered into one or more co-varying sets by means of a pattern recognition algorithm. 

38. The method of claim 37, wherein the pattern recognition algorithm is 
a clustering algorithm. 

39. The method of claim 38, wherein the clustering algorithm analyzes 
arrays or matrices, said arrays or matrices representing said measured changes in expression 
of the plurality of genes in the cell in response to the plurality of perturbations to the cell, 
wherein said analysis determines dissimilarities between individual genes. 

40. The method of claim 36, wherein the plurality of perturbations to the 
cell are also grouped or re-ordered according to their similarity. 

41. The method of claim 40, wherein the plurality of perturbations to the 
cell are grouped or re-ordered by means of a pattern recognition algorithm. 

42. The method of claim 41, wherein the pattern recognition algorithm is 
a clustering algorithm. 

43. The method of claim 42, wherein the clustering algorithm analyzes 
arrays or matrices, said arrays or matrices representing said measured changes in expression 
of the plurality of genes in the cell in response to the plurality of perturbations to the cell. 

44. The method of claim 1, wherein the biological pathway is selected 
from the group consisting of: a signaling pathway, a control pathway, a mating pathway, a 
cell cycle pathway, a cell division pathway, a cell repair pathway, a small molecule synthesis 
pathway, a protein synthesis pathway, a DNA synthesis pathway, an RNA synthesis pathway, 
a DNA repair pathway, a stress-response pathway, a cytoskeletal pathway, a steroid 
pathway, a receptor-mediated signal transduction pathway, a transcriptional pathway, a 
translational pathway, an immune response pathway, a heat-shock pathway, a motility 
pathway, a secretion pathway, an endocytotic pathway, a protein sorting pathway, a 
phagocytic pathway, a photosynthetic pathway, an excretion pathway, an electrical response 
pathway, a pressure-response pathway, a protein modification pathway, a small-molecule 
response pathway, a toxic-molecule response pathway, and a transformation pathway. 

45. The method of claim 1, wherein the reporter gene is a reporter for the 
ergosterol-pathway, and the reporter gene is selected from the group consisting of: 
YHR039C (as depicted in FIG. 2, as set forth in SEQ ID NO:1), YLR100W (as depicted in 
FIG. 4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG. 6, as set forth in SEQ ID 
NO:5), YGR131W (as depicted in FIG. 8, as set forth in SEQ ID NO:7), and YDR453C (as 
depicted in FIG. 10, as set forth in SEQ ID NO:9). 

46. The method of claim 1, wherein the reporter gene is a reporter for the 
PKC-pathway, and the reporter gene is selected from the group consisting of: 
SLT2(YHR030C) (as depicted in FIG. 17A-B, as set forth in SEQ ID NO:11), YKL161C (as 
depicted in FIG. 19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in 
FIG. 21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG. 23A-B, 
as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG. 25A-B, as set forth in SEQ ID 
NO:19), and PST1(YDR055W) (as depicted in FIG. 27A-B, as set forth in SEQ ID NO:21). 

47. The method of claim 33, wherein the biological pathway is selected 
from the group consisting of: a signaling pathway, a control pathway, a mating pathway, a 
cell cycle pathway, a cell division pathway, a cell repair pathway, a small molecule synthesis 
pathway, a protein synthesis pathway, a DNA synthesis pathway, an RNA synthesis pathway, 
a DNA repair pathway, a stress-response pathway, a cytoskeletal pathway, a steroid 
pathway, a receptor-mediated signal transduction pathway, a transcriptional pathway, a 
translational pathway, an immune response pathway, a heat-shock pathway, a motility 
pathway, a secretion pathway, an endocytotic pathway, a protein sorting pathway, a 
phagocytic pathway, a photosynthetic pathway, an excretion pathway, an electrical response 
pathway, a pressure-response pathway, a protein modification pathway, a small-molecule 
response pathway, a toxic-molecule response pathway, and a transformation pathway. 

48. The method of claim 33, wherein the target gene of the PKC-pathway 
is selected from the group consisting of: SLT2(YHR030C) (as depicted in FIG. 17A-B, as set 
forth in SEQ ID NO:11), and YKL161C (as depicted in FIG. 19A-B, as set forth in SEQ ID 
NO:13). 

49. A method for determining whether a molecule affects the function or 
activity of an ergosterol pathway in a cell comprising: 

(a) contacting the cell with, or recombinantly expressing within the cell, the 
molecule; and 

(b) determining whether the expression of one or more of the genes 
selected from the group consisting of: YHR039C (as depicted in 
FIG. 2, as set forth in SEQ ID NO:1), YLR100W (as depicted in 
FIG. 4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG. 6, 
as set forth in SEQ ID NO:5), YGR131W (as depicted in FIG. 8, as set 
forth in SEQ ID NO:7), and YDR453C (as depicted in FIG. 10, as set 
forth in SEQ ID NO:9) is changed relative to said expression in the 
absence of the molecule. 

50. The method according to claim 49 which is a method for determining 
whether the molecule inhibits ergosterol synthesis such that a cell contacted with the 
molecule exhibits a lower level of ergosterol than a cell which is not contacted with said 
molecule. 

51. The method according to claim 49 wherein step (b) comprises 
determining whether YPL272C expression increases. 

52. A kit comprising in one or more containers a) a substance selected 
from the group consisting of an antibody against an ergosterol-pathway protein, a gene probe 
capable of hybridizing to RNA of an ergosterol-pathway gene, and pairs of gene primers 
capable of priming amplification of at least a portion of an ergosterol-pathway gene, and b) a 
molecule known to be capable of perturbing the ergosterol pathway. 

53. A method for identifying a molecule that activates the ergosterol 
pathway in yeast comprising contacting a yeast cell with one or more candidate molecules, 
and detecting a change in the RNA expression of a reporter gene for the ergosterol-pathway 
relative to the expression of the reporter gene in a yeast cell not contacted by the one or more 
candidate molecules, wherein the reporter gene is selected from the group consisting of: 
YHR039C (as depicted in FIG. 2, as set forth in SEQ ID NO:1), YLR100W (as depicted in 
FIG. 4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG. 6, as set forth in SEQ ID 
NO:5), YGR131W (as depicted in FIG. 8, as set forth in SEQ ID NO:7), and YDR453C (as 
depicted in FIG. 10, as set forth in SEQ ID NO:9). 

54. A method for identifying a molecule that activates the ergosterol 
pathway in yeast comprising contacting a yeast cell with one or more candidate molecules, 
and detecting a change in the protein expression of a reporter gene for the ergosterol-pathway 
relative to the expression of the reporter gene in a yeast cell not contacted by the 
one or more candidate molecules, wherein the reporter gene is selected from the group 
consisting of: YHR039C (as depicted in FIG. 2, as set forth in SEQ ID NO:1), YLR100W (as 
depicted in FIG. 4, as set forth in SEQ ID NO:3), YPL272C (as depicted in FIG. 6, as set forth 
in SEQ ID NO:5), YGR131W (as depicted in FIG. 8, as set forth in SEQ ID NO:7), and 
YDR453C (as depicted in FIG. 10, as set forth in SEQ ID NO:9). 

55. The method according to claim 53, wherein the yeast cell is a 
transgenic cell. 

56. The method according to claim 54, wherein the yeast cell is a 
transgenic cell. 

57. A method for identifying a molecule that modulates the expression of 
an ergosterol-pathway gene selected from the group consisting of YHR039C (as depicted in 
FIG. 2, as set forth in SEQ ID NO:1), YLR100W (as depicted in FIG. 4, as set forth in SEQ 
ID NO:3), YPL272C (as depicted in FIG. 6, as set forth in SEQ ID NO:5), YGR131W (as 
depicted in FIG. 8, as set forth in SEQ ID NO:7), and YDR453C (as depicted in FIG. 10, as 
set forth in SEQ ID NO:9), comprising recombinantly expressing in a fungal cell one or 
more candidate molecules, and detecting the expression of said ergosterol-pathway gene; 
wherein an increase or decrease in the gene expression relative to the expression in the 
absence of candidate molecules indicates that the molecules modulate ergosterol-pathway 
gene expression. 

58. The method according to claim 57, wherein the fungal cell is a 
transgenic cell. 

59. A method for identifying a molecule that modulates the activity of an 
ergosterol-pathway protein selected from the group consisting of YHR039C (as depicted in 
FIG. 3, as set forth in SEQ ID NO:2), YLR100W (as depicted in FIG. 5, as set forth in SEQ 
ID NO:4), YPL272C (as depicted in FIG. 7, as set forth in SEQ ID NO:6), YGR131W (as 
depicted in FIG. 9, as set forth in SEQ ID NO:8), and YDR453C (as depicted in FIG. 11, as 
set forth in SEQ ID NO:10), comprising contacting a fungal cell with one or more candidate 
molecules, and detecting said protein; wherein an increase or decrease in the protein level 
relative to the level in the absence of candidate molecules indicates that the molecule 
modulates ergosterol-pathway gene expression. 

60. A method of identifying a molecule that binds to a ligand selected 
from the group consisting of (i) an S. cerevisiae ergosterol-pathway protein selected from the 
group consisting of YHR039C (as depicted in FIG. 3, as set forth in SEQ ID NO:2), 
YLR100W (as depicted in FIG. 5, as set forth in SEQ ID NO:4), YPL272C (as depicted in 
FIG. 7, as set forth in SEQ ID NO:6), YGR131W (as depicted in FIG. 9, as set forth in SEQ 
ID NO:8), and YDR453C (as depicted in FIG. 11, as set forth in SEQ ID NO:10), (ii) a 
fragment of the S. cerevisiae ergosterol-pathway protein, and (iii) a nucleic acid encoding 
the S. cerevisiae ergosterol-pathway protein or fragment, the method comprising: 

(a) contacting the ligand with a plurality of molecules under conditions 
conducive to binding between the ligand and the molecules; and 

(b) identifying a molecule within the plurality that binds to the ligand. 

61. A method for determining whether a molecule affects the function or 
activity of a PKC pathway in a cell comprising: 

(a) contacting the cell with, or recombinantly expressing within the cell, the 
molecule; and 

(b) determining whether the expression of one or more of the genes 
selected from the group consisting of: SLT2(YHR030C) (as depicted 
in FIG. 17A-B, as set forth in SEQ ID NO:11), YKL161C (as depicted 
in FIG. 19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as 
depicted in FIG. 21A-B, as set forth in SEQ ID NO:15), 
YPK2(YMR104C) (as depicted in FIG. 23A-B, as set forth in SEQ ID 
NO:17), YLR194C (as depicted in FIG. 25A-B, as set forth in SEQ ID 
NO:19), and PST1(YDR055W) (as depicted in FIG. 27A-B, as set forth 
in SEQ ID NO:21) is changed relative to said expression in the 
absence of the molecule. 

62. The method according to claim 61 wherein step (b) comprises 
determining whether SLT2 expression increases. 

63. A kit comprising in one or more containers a) a substance selected 
from the group consisting of an antibody against a PKC-pathway protein, a gene probe 
capable of hybridizing to RNA of a PKC-pathway gene, and pairs of gene primers capable of 
priming amplification of at least a portion of a PKC-pathway gene, and b) a molecule known 
to be capable of perturbing the PKC pathway. 

64. A method for identifying a molecule that activates the PKC pathway 
in yeast comprising contacting a yeast cell with one or more candidate molecules, and 
detecting a change in the RNA expression of a reporter gene for the PKC-pathway relative to 
the expression of the reporter gene in a yeast cell not contacted by the one or more candidate 
molecules, wherein the reporter gene is selected from the group consisting of: 
SLT2(YHR030C) (as depicted in FIG. 17A-B, as set forth in SEQ ID NO:11), YKL161C (as 
depicted in FIG. 19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in 
FIG. 21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG. 23A-B, 
as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG. 25A-B, as set forth in SEQ ID 
NO:19), and PST1(YDR055W) (as depicted in FIG. 27A-B, as set forth in SEQ ID NO:21). 

65. A method for identifying a molecule that activates the PKC pathway 
in yeast comprising contacting a yeast cell with one or more candidate molecules, and 
detecting a change in the protein expression of a reporter gene for the PKC-pathway relative 
to the expression of the reporter gene in a yeast cell not contacted by the one or more 
candidate molecules, wherein the reporter gene is selected from the group consisting of: 
SLT2(YHR030C) (as depicted in FIG. 17A-B, as set forth in SEQ ID NO:11), YKL161C (as 
depicted in FIG. 19A-B, as set forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in 
FIG. 21A-B, as set forth in SEQ ID NO:15), YPK2(YMR104C) (as depicted in FIG. 23A-B, 
as set forth in SEQ ID NO:17), YLR194C (as depicted in FIG. 25A-B, as set forth in SEQ ID 
NO:19), and PST1(YDR055W) (as depicted in FIG. 27A-B, as set forth in SEQ ID NO:21). 

66. The method according to claim 64, wherein the yeast cell is a 
transgenic cell. 

67. The method according to claim 65, wherein the yeast cell is a 
transgenic cell. 

68. A method for identifying a molecule that modulates the expression of 
a PKC-pathway gene selected from the group consisting of SLT2(YHR030C) (as depicted 
in FIG. 17A-B, as set forth in SEQ ID NO:11), YKL161C (as depicted in FIG. 19A-B, as set 
forth in SEQ ID NO:13), PIR3(YKL163W) (as depicted in FIG. 21A-B, as set forth in SEQ 
ID NO:15), YPK2(YMR104C) (as depicted in FIG. 23A-B, as set forth in SEQ ID NO:17), 
YLR194C (as depicted in FIG. 25A-B, as set forth in SEQ ID NO:19), and PST1(YDR055W) 
(as depicted in FIG. 27A-B, as set forth in SEQ ID NO:21), comprising recombinantly 
expressing in a fungal cell one or more candidate molecules, and detecting the expression of 
said PKC-pathway gene; wherein an increase or decrease in the gene expression relative to 
the expression in the absence of candidate molecules indicates that the molecules modulate 
PKC-pathway gene expression. 

69. The method according to claim 68, wherein the fungal cell is a 
transgenic cell. 

70. A method for identifying a molecule that modulates the activity of a 
PKC-pathway protein selected from the group consisting of SLT2(YHR030C) (as depicted 
in FIG. 18, as set forth in SEQ ID NO:12), YKL161C (as depicted in FIG. 20, as set forth in 
SEQ ID NO:14), PIR3(YKL163W) (as depicted in FIG. 22, as set forth in SEQ ID NO:16), 
YPK2(YMR104C) (as depicted in FIG. 24, as set forth in SEQ ID NO:18), YLR194C (as 
depicted in FIG. 26, as set forth in SEQ ID NO:20), and PST1(YDR055W) (as depicted in 
FIG. 28, as set forth in SEQ ID NO:22), comprising contacting a fungal cell with one or more 
candidate molecules, and detecting said protein; wherein an increase or decrease in the protein 
level relative to the level in the absence of candidate molecules indicates that the molecule 
modulates PKC-pathway gene expression. 

71. A method of identifying a molecule that binds to a ligand selected 
from the group consisting of (i) an S. cerevisiae PKC-pathway protein selected from the 
group consisting of SLT2(YHR030C) (as depicted in FIG. 18, as set forth in SEQ ID 
NO:12), YKL161C (as depicted in FIG. 20, as set forth in SEQ ID NO:14), 
PIR3(YKL163W) (as depicted in FIG. 22, as set forth in SEQ ID NO:16), YPK2(YMR104C) 
(as depicted in FIG. 24, as set forth in SEQ ID NO:18), YLR194C (as depicted in FIG. 26, as 
set forth in SEQ ID NO:20), and PST1(YDR055W) (as depicted in FIG. 28, as set forth in 
SEQ ID NO:22), (ii) a fragment of the S. cerevisiae PKC-pathway protein, and (iii) a nucleic 
acid encoding the S. cerevisiae PKC-pathway protein or fragment, the method comprising: 

(a) contacting the ligand with a plurality of molecules under conditions 
conducive to binding between the ligand and the molecules; and 

(b) identifying a molecule within the plurality that binds to the ligand. 



72. A method for determining whether a molecule affects the function or 
activity of an Invasive Growth pathway in a cell comprising: 

(a) contacting the cell with, or recombinantly expressing within the cell, the 
molecule; and 

(b) determining whether the expression of one or more of the genes 
selected from the group consisting of: KSS1(YGR040W) (as depicted 
in FIG. 29, as set forth in SEQ ID NO:23), PGU1(YJR153W) (as 
depicted in FIG. 31, as set forth in SEQ ID NO:25), YLR042C (as 
depicted in FIG. 33, as set forth in SEQ ID NO:27), and 
SVS1(YPL163C) (as depicted in FIG. 35, as set forth in SEQ ID 
NO:29), is changed relative to said expression in the absence of the 
molecule. 

73. The method according to claim 72 wherein step (b) comprises 
determining whether expression of KSS1(YGR040W) (as depicted in FIG. 29, as set forth 
in SEQ ID NO:23) increases. 



74. A kit comprising in one or more containers a) a substance selected 
from the group consisting of an antibody against an Invasive Growth pathway protein, a 
gene probe capable of hybridizing to RNA of an Invasive Growth pathway gene, and pairs of 
gene primers capable of priming amplification of at least a portion of an Invasive Growth 
pathway gene, and b) a molecule known to be capable of perturbing the Invasive Growth 
pathway. 

75. A method for identifying a molecule that activates the Invasive 
Growth pathway in yeast comprising contacting a yeast cell with one or more candidate 
molecules, and detecting a change in the RNA expression of a reporter gene for the Invasive 
Growth pathway relative to the expression of the reporter gene in a yeast cell not contacted 
by the one or more candidate molecules, wherein the reporter gene is selected from the 
group consisting of: KSS1(YGR040W) (as depicted in FIG. 29, as set forth in SEQ ID 
NO:23), PGU1(YJR153W) (as depicted in FIG. 31, as set forth in SEQ ID NO:25), 
YLR042C (as depicted in FIG. 33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as 
depicted in FIG. 35, as set forth in SEQ ID NO:29). 

76. A method for identifying a molecule that activates the Invasive 
Growth pathway in yeast comprising contacting a yeast cell with one or more candidate 
molecules, and detecting a change in the protein expression of a reporter gene for the 
Invasive Growth pathway relative to the expression of the reporter gene in a yeast cell not 
contacted by the one or more candidate molecules, wherein the reporter gene is selected 
from the group consisting of: KSS1(YGR040W) (as depicted in FIG. 29, as set forth in 
SEQ ID NO:23), PGU1(YJR153W) (as depicted in FIG. 31, as set forth in SEQ ID NO:25), 
YLR042C (as depicted in FIG. 33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as 
depicted in FIG. 35, as set forth in SEQ ID NO:29). 

77. The method according to claim 75, wherein the yeast cell is a 
transgenic cell. 

78. The method according to claim 76, wherein the yeast cell is a 
transgenic cell. 

79. A method for identifying a molecule that modulates the expression of 
an Invasive Growth pathway gene selected from the group consisting of KSS1(YGR040W) 
(as depicted in FIG. 29, as set forth in SEQ ID NO:23), PGU1(YJR153W) (as depicted in 
FIG. 31, as set forth in SEQ ID NO:25), YLR042C (as depicted in FIG. 33, as set forth in 
SEQ ID NO:27), and SVS1(YPL163C) (as depicted in FIG. 35, as set forth in SEQ ID 
NO:29), comprising recombinantly expressing in a fungal cell one or more candidate 
molecules, and detecting the expression of said Invasive Growth pathway gene; wherein an 
increase or decrease in the gene expression relative to the expression in the absence of 
candidate molecules indicates that the molecules modulate Invasive Growth pathway gene 
expression. 

80. The method according to claim 79, wherein the fungal cell is a 
transgenic cell. 

81. A method for identifying a molecule that modulates the activity of an
Invasive Growth pathway protein selected from the group consisting of KSS1(YGR040W)
(as depicted in FIG.30, as set forth in SEQ ID NO:24), PGU1(YJR153W) (as depicted in
FIG.32, as set forth in SEQ ID NO:26), YRL042C (as depicted in FIG.34, as set forth in
SEQ ID NO:28), and SVS1(YPL163C) (as depicted in FIG.36, as set forth in SEQ ID
NO:30), comprising contacting a fungal cell with one or more candidate molecules, and
detecting said protein; wherein an increase or decrease in the protein level relative to the
level in the absence of candidate molecules indicates that the molecule modulates Invasive
Growth pathway gene expression.

82. A method of identifying a molecule that binds to a ligand selected
from the group consisting of (i) an S. cerevisiae Invasive Growth pathway protein selected
from the group consisting of KSS1(YGR040W) (as depicted in FIG.30, as set forth in SEQ
ID NO:24), PGU1(YJR153W) (as depicted in FIG.32, as set forth in SEQ ID NO:26),
YRL042C (as depicted in FIG.34, as set forth in SEQ ID NO:28), and SVS1(YPL163C) (as
depicted in FIG.36, as set forth in SEQ ID NO:30), (ii) a fragment of the S. cerevisiae
Invasive Growth pathway protein, and (iii) a nucleic acid encoding the S. cerevisiae Invasive
Growth pathway protein or fragment, the method comprising:

(a) contacting the ligand with a plurality of molecules under conditions 
conducive to binding between the ligand and the molecules; and 

(b) identifying a molecule within the plurality that binds to the ligand. 

83. The method of claim 1, wherein the reporter gene is a reporter for the
Invasive Growth pathway, and the reporter gene is selected from the group consisting of
KSS1(YGR040W) (as depicted in FIG.29, as set forth in SEQ ID NO:23),
PGU1(YJR153W) (as depicted in FIG.31, as set forth in SEQ ID NO:25), YRL042C (as
depicted in FIG.33, as set forth in SEQ ID NO:27), and SVS1(YPL163C) (as depicted in
FIG.35, as set forth in SEQ ID NO:29).
YHR039C DNA sequence, including the coding region, 350
bp of upstream sequence, and 100 bp of downstream sequence.

TCTAGTTTTC TAATCATATA TCTTTTTATA ATATAATACC AATAGAATAA 51 
AAATGTATAA ACTGACATTG CATTCGGTCT TTACGACTCT CGCTTTATCC 101 
ATTCAGCCTT TTTTTTTTTT TTTTTTTTTT CTCTATCTGC TAAACGAGTA 151 
GTAGTATAAT CAAAAATGTG TTATTTAGTA TATCGGTTGT AAAGGAGAAA 201 
GTATGGTCTC TCTATTTTTA TTTTATTAAC GAAAAATACT AAACGCCGAT 251 
GGGGATTACT ATATAATTAT AATAGTATTT GCAGAATAGT AGAATTCTTT 301 
TCACAGTTCA CGTTCAGTTT CTCCTCTGTT TTATCGAACG TTTATTCATC 351

ATGTCCAAGG TCTATCTGAA TTCAGACATG ATTAACCATT TGAACTCCAC 401 
AGTTCAAGCT TACTTTAACT TATGGTTGGA GAAGCAAAAC GCAATAATGC 451
GTTCTCAACC CCAAATTATT CAAGATAACC AAAAACTGAT AGGCATTACA 501
ACGCTAGTTG CCTCAATTTT CACTCTGTAT GTTTTGGTCA AGATAATCTC 551
CACCCCAGCA AAGTGTTCCT CGTCCTATAA GCCAGTCAAA TTCTCCCTTC 601
CTGCACCAGA GGCCGCTCAA AATAATTGGA AGGGCAAGAG GTCTGTTTCC 651
ACTAACATAT GGAATCCTGA AGAACCAAAC TTTATTCAAT GTCATTGTCC 701 
CGCCACAGGT CAATATCTAG GTTCTTTTCC ATCGAAAACG GAAGCTGACA 751 
TAGATGAAAT GGTTTCTAAG GCAGGCAAAG CTCAATCTAC TTGGGGCAAT 801 
TCTGATTTCT CAAGAAGATT GAGAGTTTTG GCTTCTTTGC ATGATTATAT 851 
TCTAAATAAT CAAGATCTTA TTGCGAGAGT AGCGTGCAGG GATTCAGGAA 901 
AGACAATGTT AGACGCATCG ATGGGTGAAA TCTTGGTTAC TTTAGAAAAA 951 
ATTCAATGGA CTATAAAGCA CGGCCAAAGA GCGTTGCAAC CTTCGAGACG 1001 
TCCGGGCCCC ACTAATTTTT TCATGAAGTG GTATAAAGGT GCAGAAATCC 1051 
GTTATGAACC ACTGGGTGTG ATCAGTTCTA TCGTTTCCTG GAACTATCCA 1101 
TTCCATAACT TATTGGGTCC AATTATTGCA GCATTGTTCA CAGGGAATGC 1151 
CATTGTAGTA AAATGTTCAG AACAAGTTGT CTGGTCTTCG GAATTTTTCG 1201 
TCGAGCTGAT CCGCAAATGT TTGGAAGCTT GTGATGAAGA TCCAGATTTG 1251
GTTCAGTTGT GCTATTGTTT ACCTCCAACT GAAAATGATG ATTCCGCAAA 1301
TTATTTCACC TCTCATCCTG GTTTCAAACA TATCACTTTT ATTGGCAGTC 1351
AGCCCGTAGC GCACTATATT CTAAAATGCG CTGCCAAATC ATTGACACCC 1401
GTAGTTGTGG AGCTTGGTGG TAAGGATGCG TTTATTGTCC TAGACTCAGC 1451
TAAGAATTTA GATGCTTTAT CTTCTATCAT CATGAGGGGT ACTTTCCAAT 1501 
CATCCGGTCA AAATTGTATT GGTATTGAGA GGGTTATTGT CAGTAAGGAA 1551 
AATTATGATG ATTTAGTCAA GATTTTGAAT GACCGTATGA CTGCAAATCC 1601 
ACTACGCCAA GGGTCTGATA TTGATCATTT AGAAAATGTT GATATGGGGG 1651 
CAATGATATC TGACAACAGA TTCGATGAAC TAGAAGCTTT GGTTAAAGAT 1701 
GCTGTTGCAA AGGGAGCTCG TTTACTTCAA GGTGGTTCCC GCTTCAAACA 1751 
TCCAAAGTAT CCACAAGGTC ATTATTTCCA ACCAACTCTT TTGGTGGATG 1801 
TCACTCCAGA AATGAAAATA GCACAAAACG AAGTGTTTGG CCCAATTTTA 1851 
GTCATGATGA AAGCTAAGAA TACTGACCAT TGTGTACAAC TAGCCAACTC 1901 
TGCGCCATTT GGTCTAGGTG GTTCTGTGTT TGGTGCGGAT ATCAAGGAAT 1951 
GCAATTACGT CGCAAATAGC CTACAAACTG GTAATGTAGC CATTAATGAT 2001 
TTTGCTACAT TCTATGTTTG TCAATTACCA TTTGGTGGTA TCAATGGTTC 2051 
AGGTTACGGT AAATTTGGTG GTGAAGAAGG TCTTTTGGGT TTGTGCAATG 2101 
CCAAAAGTGT CTGTTTTGAT ACTTTGCCTT TTGTCTCCAC TCAAATTCCA 2151 
AAACCATTAG ACTACCCTAT TCGTAACAAT GCTAAGGCTT GGAATTTTGT 2201 
AAAGAGTTTC ATCGTAGGAG CTTATACAAA TTCCACATGG CAAAGAATAA 2251
AGTCACTGTT CTCTTTAGCT AAAGAAGCCA GCTAG 

TTTAC TTTAGAGGAA 2301 

GCAACAAACT TATCAATAAT TTGGTATTTA TTATTATATA AAATGAACTT 2351 
TTTATGTACA AGATTTATGA TTTTTTGATT CTATA 
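
The relationship between the coding region of a DNA sequence such as the one above and the corresponding protein sequence can be checked with a short translation routine. The sketch below is illustrative only: it assumes the standard nuclear genetic code, and the names `CODON_TABLE` and `translate` are hypothetical helpers introduced for the example, not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard nuclear code applies to the S. cerevisiae ORFs listed here).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, start=0):
    """Translate DNA from `start` (0-based) until a stop codon or the end.

    Whitespace is ignored so that sequences printed in blocks of ten bases
    can be pasted directly.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of the YHR039C coding region as printed above.
    print(translate("ATGTCCAAGG TCTATCTGAA TTCAGACATG"))
```

Because OCR errors in a printed sequence shift or corrupt codons, comparing such a translation with the listed protein sequence is one way to locate damaged positions.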
Yhr039c protein sequence.

MSKVYLNSDMINHLNSTVQAYFNLWLEKQNAIMRSQPQIIQDNQKLIGITTLVASIFTLY
VLVKIISTPAKCSSSYKPVKFSLPAPEAAQNNWKGKRSVSTNIWNPEEPNFIQCHCPATG
QYLGSFPSKTEADIDEMVSKAGKAQSTWGNSDFSRRLRVLASLHDYILNNQDLIARVACR
DSGKTMLDASMGEILVTLEKIQWTIKHGQRALQPSRRPGPTNFFMKWYKGAEIRYEPLGV
ISSIVSWNYPFHNLLGPIIAALFTGNAIVVKCSEQVVWSSEFFVELIRKCLEACDEDPDL
VQLCYCLPPTENDDSANYFTSHPGFKHITFIGSQPVAHYILKCAAKSLTPVVVELGGKDA
FIVLDSAKNLDALSSIIMRGTFQSSGQNCIGIERVIVSKENYDDLVKILNDRMTANPLRQ
GSDIDHLENVDMGAMISDNRFDELEALVKDAVAKGARLLQGGSRFKHPKYPQGHYFQPTL
LVDVTPEMKIAQNEVFGPILVMMKAKNTDHCVQLANSAPFGLGGSVFGADIKECNYVANS
LQTGNVAINDFATFYVCQLPFGGINGSGYGKFGGEEGLLGLCNAKSVCFDTLPFVSTQIP
KPLDYPIRNNAKAWNFVKSFIVGAYTNSTWQRIKSLFSLAKEAS
YLR100W DNA sequence, including the coding sequence
(with the start and stop codons in bold), plus 800 bp of upstream
sequence and 100 bp of downstream sequence.

ACGTACAAAA AAGAGCACGC TGCTTTATTT ATACTTTTGT GCCACAAGAA 51 
TGATCAACAT CAACATAAAT ATCAACTAGT ATCTGCAACA CATCTGCTCC 101 
ACGGAACTAA ACCCGTTGAG CAGTGCCCCG TGGAAACGTA AACTATCGCA 151 
AATTGGGATT AACAAGCCAA AAACAGCCAA GCAAGATTCA CGAAACCGCG 201
CCTCGTTTGG ACCCCGAAGG CCCATTTAAC GGCCGGCCGT TACAAGCAAG 251
ATCGGCAGAG CAAACCACTC CCCAGCACCA CAGCACATCA CTGCACGAGC 301
AACAATAACT AGAACATGGC AGATAGCGAG GATACCTCTG TGATCCTGCA 351
GGGCATCGAC ACAATCAACA GCGTGGAGGG CCTGGAAGAA GATGGTTACC 401
TCAGCGACGA GGACACGTCA CTCAGCAACG AGCTCGCAGA TGCACAGCGT 451
CAATGGGAAG AGTCGCTGCA ACAGTTGAAC AAGCTGCTCA ACTGGGTCCT 501 
GCTGCCCCTG CTGGGCAAGT ATATAGGTAG GAGAATGGCC AAGACTCTAT 551 
GGAGTAGGTT CATTGAACAC TTTGTATAAG TGTTTGTTGT TTATGTATCC 601 
GCATATAGCA GTTATAACAG ATAAATGGCA CTTTTCGCAC ACCCGTTGTT 651 
TTATCTCCGA TAGTACGTGG GCCTTTATTT ATGGTCGTTT AACGAAAGAA 701 
CGGCATCTTG AATTGAGCAG GTATTTAAAA GATAGGACGA GAAACAAGCA 751
CATGATCTGT GTCGAAAAAA AGTAGCAAAG AGAAAAAGTA GGAGGATAGG 801 

ATGAACAGGA AAGTAGCTAT CGTAACGGGT ACTAATAGTA ATCTTGGTCT 851 
GAACATTGTG TTCCGTCTGA TTGAAACTGA GGACACCAAT GTCAGATTGA 901 
CCATTGTGGT GACTTCTAGA ACGCTTCCTC GAGTGCAGGA GGTGATTAAC 951 
CAGATTAAAG ATTTTTACAA CAAATCAGGC CGTGTAGAGG ATTTGGAAAT 1001 
AGACTTTGAT TATCTGTTGG TGGACTTCAC CAACATGGTG AGTGTCTTGA 1051 
ACGCATATTA CGACATCAAC AAAAAGTACA GGGCGATAAA CTACCTTTTC 1101 
GTGAATGCTG CGCAAGGTAT CTTTGACGGT ATAGATTGGA TCGGAGCGGT 1151 
CAAGGAGGTT TTCACCAATC CATTGGAGGC AGTGACAAAT CCGACATACA 1201 
AGATACAACT GGTGGGCGTC AAGTCTAAAG ATGACATGGG GCTTATTTTC 1251 
CAGGCCAATG TGTTTGGTCC GTACTACTTT ATCAGTAAAA TTCTGCCTCA 1301 
ATTGACCAGG GGAAAGGCTT ATATTGTTTG GATTTCGAGT ATTATGTCCG 1351 
ATCCTAAGTA TCTTTCGTTG AACGATATTG AACTACTAAA GACAAATGCC 1401 
TCTTATGAGG GCTCCAAGCG TTTAGTTGAT TTACTGCATT TGGCCACCTA 1451 
CAAAGACTTG AAAAAGCTGG GCATAAATCA GTATGTAGTT CAACCGGGCA 1501 
TATTTACAAG CCATTCCTTC TCCGAATATT TGAATTTTTT CACCTATTTC 1551 
GGCATGCTAT GCTTGTTCTA TTTGGCCAGG CTGTTGGGGT CTCCATGGCA 1601 
CAATATTGAT GGTTATAAAG CTGCCAATGC CCCAGTATAC GTAACTAGAT 1651 
TGGCCAATCC AAACTTTGAG AAACAAGACG TAAAATACGG TTCTGCTACC 1701 
TCTAGGGATG GTATGCCATA TATCAAGACG CAGGAAATAG ACCCTACTGG 1751 
AATGTCTGAT GTCTTCGCTT ATATACAGAA GAAGAAACTG GAATGGGACG 1801 
AGAAACTGAA AGATCAAATT GTTGAAACTA GAACCCCCAT TTAA 

TATATC 1851 

TCTGCGTACA TATGTATATA TATATATGTG TGTATATACA TGTATGTCTG 1901 
TATAGAAAAC GCATATCAAC TGATATATAT ACACGTGAAG CAAA 
Ylr100w protein sequence.
MNRKVAIVTGTNSNLGLNIVFRLIETEDTNVRLTIVVTSRTLPRVQEVINQIKDFYNKSG
RVEDLEIDFDYLLVDFTNMVSVLNAYYDINKKYRAINYLFVNAAQGIFDGIDWIGAVKEV
FTNPLEAVTNPTYKIQLVGVKSKDDMGLIFQANVFGPYYFISKILPQLTRGKAYIVWISS
IMSDPKYLSLNDIELLKTNASYEGSKRLVDLLHLATYKDLKKLGINQYVVQPGIFTSHSF
SEYLNFFTYFGMLCLFYLARLLGSPWHNIDGYKAANAPVYVTRLANPNFEKQDVKYGSAT
SRDGMPYIKTQEIDPTGMSDVFAYIQKKKLEWDEKLKDQIVETRTPI
YPL272C DNA sequence

1 GATGGCAAAC CTCCGCAATG ATTGGCGTTC TAGCGGCTAT CCGAATTCAC 

51 AATCGACAAG AAGTACTTCT AACTTACACA AGGCAACGAA ATAATATCAC 

101 TCTATGAAAC TGCCATTTGG GTAATAGGAG TATATTGAAC GACACCGGGT 

151 CAACAAGCAA CTTTCCTAAG CCTTTTACAC TTCTTCACAT CATTCAAGAT 

201 CGCCTTTTAA CGAGCTACAA ACCTTCACGT TCGTTCTTCT ATGGAAACGT 

251 TTAAGATAAC GTTAAAACGT TCTCAATCAC AGAATTTAAG ATGATTAGAA 

301 ATGTTTTCCA AGGGATAGGG CGAAGCACAA CCTCGAAAAA TGGCAAAATT

351 TTAGAATCTT AGCCACCTTA ACGTCTACTT AGAGCCTTAG AAAAGCCATC 

401 AAGATTGGTG GAATAGTTGT TGAGGGAACT TAGCCGCCAC ATTCTCGTAG 

451 CCAAATAAAG CGAATCTGAC CATTGTATGT TTCTTTTTCA CTGGTATGAT 

501 AGCCCAATGT GTTTAAGGAA AGTTAGGACA ACACACCCGA AGAAGGACGT 

551 CACCCCTGCA TTCCCAAACG AGCTATGAAA TAGCTCTTTC CTCTACAAGT 

601 AATAACAACA ACTTTTTTGT CTGTTTTCCG ACCGTTTAAC TTCAGAGATT 

651 AATTTTTTCA ACGCGCTTTC GTTGAACGTC GCAAATTCGT TTAGAATAAA 

701 CGAAAGGTGA CAGAAATAGA AGATTATAGC CATGCATACG CACATAAATT 

751 GAAAACTGTT TCGAGGCTGA GTATTCCCTG CGTCTGCAGC CATCAGGGGT 

801 ATGACTCTGC TACACGTTTA CTATATTCTT GGCTAAACGA TTCATTAACG 

851 AAGCGATGAG TAGATCACAC TCGGCATACG AGCACAAATT TGTATGGGGG 

901 GACGGTCATA TATAAAAGGG TGTATACGTT ATCCTTGTTA TACCTGTCCA 

951 AAGAAGTGCA TTTGTAACTC ACAACACAGA CACATCCTCA CTTTATCATA 

1001 ATGACTACGT TTAGGCCACT ATCAAGTTTT GAAAAAAAAA TTCTCACTCA 

1051 ATCTTTGAAT GACCAAAGAA ATGGAACTAT TTTTTCGAGT ACATATTCAA 

1101 AATCTTTAAG TAGAGAAAAT GACGCTGACT GGCATTCTGA TGAAGTCACG 

1151 CTCGGAACAA ATTCTTCCAA AGATGATTCT CGTCTGACTC TGCCCCTAAT 

1201 AGCAACAACT TTGAAGAGAT TGATTAAATC GCAACCGGCA TTGTTTGCAA 

1251 CTGTAAACGA AGAATGGGAA TTCGAGCCAT TGAAGCAGCT GAAAACTTCC 

1301 GATATTGTTA ATGTGATTGA GTTTGAAACC ATAAAAGATA AGGAGGTCAA

1351 TTGCCATTGG GGTGTTCCAC CTCCTTATCT CTTGCGTCAT GCCTTCAACA 

1401 AGACTAGATT TGTTCCCGGA TCAAATAAAC CTTTATGGAC ACTATATGTA 

1451 ATTGACGAAG CGCTATTGGT TTTTCATGGT CACGACGTAT TGTTTGATAT 

1501 ATTTTCAGCA GCTAACTTTC ACAAATTATT TTTAAAAGAG TTAAACGAAA 

1551 TCAGCACAGT AACACACTCT GAAGATAGGA TTTTGTTTGA TGTCAATGAC 

1601 ATCAATCTCT CAGAATTAAA ATTTCCCAAA TCGATATATG ATAGCGCAAA 

1651 ATTACACCTG CCCGCTATGA CACCACAAAT CTTCCACAAG CAAACTCAGT 

1701 CATTTTTCAA ATCAATATAC TATAACACTT TAAAAAGACC TTTCGGCTAT 

1751 TTAACCAATC AAACTTCCCT CAGCTCGTCA GTATCTGCAA CACAGCTGAA 

1801 AAAGTATAAT GATATTCTAA ATGCGCACAC CTCATTATGC GGGACAACAG 

1851 TATTTGGGAT AGTAAACAAC CAAAGGTTTA ACTATTTAAA GTCAATCGTT 

1901 AATCAAGAGC ATATATGTCT AAGAAGTTTC ATCTGTGGTA TTGCAATGAT 

1951 ATGTTTAAAA CCTCTCGTTA AGGATTTCAG CGGTACAATA GTATTTACTA 

2001 TTCCCATAAA TTTAAGAAAC CACTTAGGCT TAGGTGGGTC ATTGGGTCTC 

2051 TTCTTCAAAG AACTAAGGGT CGAATGTCCA CTTTCTCTAA TTGATGACGA 

2101 ACTTTCCGCC AACGAATTTT TGACCAACAG TAACGATAAC GAGGATAATG 

2151 ATGATGAGTT TAATGAAAGA TTGATGGAAT ATCAATTTAA TAAAGTTACA 

2201 AAGCACGTTA GCGGTTTTAT TATGGCAAAA CTGAGGAGTT GGGAAAAGAA 

2251 TGGGTTTAAT GATGACGATA TAAGGAGGAT GAAGTATGAC AATGACGACG 

2301 ATTTCCATAT CCAAAATTCA AGGACAAAAT TGATTCAAAT CAATGATGTT 

2351 TCCGACATAT CGTTATCGAT GAACGGCGAT GACAAATCTT TCAAAATTGT 

2401 AAGTACGGGA TTTACAAGTT CGATAAATCG CCCCACATTA ATGTCTCTTT 
2451 CCTATACATA CTGTGAAGAG ATGGGCCTGA ATATCTGTAT TCACTACCCT 
2501 GATTCGTATA ATTTAGAATC TTTTGTAGAA TGCTTCGAAT CCTTTATTGA 

2551 ATAG GCAGGT GACGCATTAA ATATATGTCT GTATAGTACG TATTTTTTCC 
2601 ATTTTATTTA TTCTTATCAA AATTTAATCA ACATATATGC TAAAGAAACT 
2651 ATTGATAGGA GATATGACAG GAAATTGCAC TGTTTCTGGA ACTTTGGCAT 
2701 GCCGAGGCCG TCATTTCCAG TATAACTGAG CAAAAAGAAG TGACGGTAAA 
2751 TACA 
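
Sequences printed in this layout (a position number followed by blocks of ten bases) must be joined into one contiguous string before they can be searched or translated. The sketch below shows one way to do that; the function name `join_numbered_blocks` is hypothetical, and the routine assumes that every purely numeric token is a position label and that every other token consists only of the letters A, C, G, and T.

```python
import re


def join_numbered_blocks(text):
    """Join position-numbered sequence lines into one contiguous string.

    Numeric tokens are treated as position labels and skipped; any token
    that is not made up solely of A, C, G, or T raises ValueError so that
    OCR damage is reported rather than silently kept.
    """
    blocks = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for token in line.split():
            if token.isdigit():
                continue  # position label such as "51"
            if not re.fullmatch(r"[ACGTacgt]+", token):
                raise ValueError(f"unexpected token {token!r} on line {line_no}")
            blocks.append(token.upper())
    return "".join(blocks)


if __name__ == "__main__":
    # First two lines of the YPL272C listing above.
    sample = """\
1 GATGGCAAAC CTCCGCAATG ATTGGCGTTC TAGCGGCTAT CCGAATTCAC

51 AATCGACAAG AAGTACTTCT AACTTACACA AGGCAACGAA ATAATATCAC
"""
    print(len(join_numbered_blocks(sample)))
```

Raising an error on unexpected characters is a deliberate choice here, since stray symbols in a scanned listing usually indicate a damaged block that needs manual review.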
YPL272C protein sequence.

MTTFRPLSSFEKKILTQSLNDQRNGTIFSSTYSKSLSRENDADWHSDEVTLGTNS 
SKDDSRLTLPLIATTLKRLIKSQPALFATVNEEWEFEPLKQLKTSDIVNVIEFET
IKDKEVNCHWGVPPPYLLRHAFNKTRFVPGSNKPLWTLYVIDEALLVFHGHDVLF
DIFSAANFHKLFLKELNEISTVTHSEDRILFDVNDINLSELKFPKSIYDSAKLHL
PAMTPQIFHKQTQSFFKSIYYNTLKRPFGYLTNQTSLSSSVSATQLKKYNDILNA
HTSLCGTTVFGIVNNQRFNYLKSIVNQEHICLRSFICGIAMICLKPLVKDFSGTI
VFTIPINLRNHLGLGGSLGLFFKELRVECPLSLIDDELSANEFLTNSNDNEDNDD 
EFNERLMEYQFNKVTKHVSGFIMAKLRSWEKNGFNDDDIRRMKYDNDDDFHIQNS 
RTKLIQINDVSDISLSMNGDDKSFKIVSTGFTSSINRPTLMSLSYTYCEEMGLNI 
CIHYPDSYNLESFVECFESFIE 
YGR131W DNA Sequence

1 TGCAAAAACT GATAAGGGCT TTCCTGCTGA TGCGCTTGCT GATTTTGCGT 

51 ATTTGCCGAA GATTGATTGA TCAATTGCGT AAAGGGGTCG TCTTCTTGAC 

101 GGTTGATATT GAATAGCATG TTTTGAATAC GTAGTTGATT GACCTCTTTC 

151 TTTTAATTGC GTGCAGCTGC TCTCAGGTTT AAGATGTACG AGGGTCCACG 

201 GGGTAGCAAG CACAAGAACG ATGATATATA TGACAGAACG ATGGATAAGA 

251 ATGGTATGTT GTCTGCACTG TTCAGCATTC GACTACCCCT CTCCCGGTTC 

301 TTTTCTCCTC GTTTCAATTT AAAAAAGCAA CTCGCTACCC GGCCGCACAC 

351 CCCTTATTCC TGTTCAGCCG TTTAAGGTGA GAACCCTTTA CTTCATAGCC 

401 TTTGTAGATC TTTCTATTGC TACCATTGAA GGGTCGGTGA CGTGGAAATT 

451 TTGACATTTA TCAGTGGCGT ATTGGGAGGC AAGCAATTGA AAGAACTGTG 

501 ATTTATTTCC GCTTGTTCGA AATTATTGAT GTTTAGCACT TTGCAGTAGC 

551 GACAATACAA TATATGTGCT TTTAGTGCTG GGATAGTTCG TAGCTCCATT 

601 TCGGGGCGCT TGTTACATTT ATTGTATATG CGCGGATGTG GCACATGCTG 

651 TTGAGATCTC ACTCCTTTGG TATCTCTTTC CTGCGCCGCA TTGTGCCGGC 

701 AGAATGTCGC GCTTGTATTC TCATGAACTT TTCCTCTTTA CGAACCCTTT 

751 GGCGGCATGC CGTTTAAAAT CTGTTGAAGA TTTCCTTTAC GAACAATGAG 

801 CAATGTTTTG CACAGGCAGG TGGGAAGTAG GGCCTATCGC GCCTTGGATG 

851 CAGATATAAG TATAAATATA AATTATAATA ATTGGCTGTA TCAGTAAATC 

901 CTTCTTGCGA TGGGAGGAAG CACGATAGAG TATGTTAAGC TTTTGAGAGG 

951 CTTCATATTC ATTGGAATTT TAAATAACAA TAAAGCAACA ACAATAATAA 

1001 ATGCTATCAG CTGCAGATAA TTTAGTGCGC ATCATAAATG CTGTTTTTCT 

1051 TATTATATCC ATAGGTCTAA TCAGCGGCCT GATAGGTACA CAGACAAAGC 

1101 ATAGTTCTCG AGTGAACTTT TGTATGTTTG CCGCCGTTTA TGGTCTGGTT 

1151 ACGGATTCAT TATATGGGTT TTTGGCTAAT TTCTGGACAT CATTAACATA 

1201 CCCAGCAATT TTGCTTGTTT TGGATTTTTT AAATTTCATA TTTACGTTTG 

1251 TAGCAGCCAC CGCTTTGGCT GTAGGTATAA GATGCCATTC GTGTAAAAAC 

1301 AAAACATATC TGGAACAGAA TAAGATCATA CAAGGCTCAA GCTCCAGATG 

1351 TCATCAATCT CAGGCTGCTG TTGCGTTTTT TTACTTTTCC TGTTTTCTAT 

1401 TCCTCATCAA AGTGACTGTG GCCACGATGG GTATGATGCA AAATGGTGGA 

1451 TTTGGCTCTA ATACCGGATT CAGCAGAAGG AGGGCAAGAA GACAAATGGG 

1501 CATACCTACA ATTTCCCAGG TTTAA 

1526 GCCTA CTGGACTGAA AAAAAGGCAA 

1551 TTCGCGTACA ATTTTCGTTG ATCGTTCTTT ATATAACCTT TGCATTAAAT 

1601 AAATTTAACA AAAAAAGTTC TTTCTAAAAT AATATTATGG TGATACATGA 

1651 ATGTGCTTTA GTTTTTTCGT AGGCTCATCC ATGTATATAT ATAAATGATA 

1701 AAAAACTAAG TTACGATATT GATAG 
YGR131W Protein Sequence.



MLSAADNLVRIINAVFLIISIGLISGLIGTQTKHSSRVNFCMFAAVYGLVTDSLY
GFLANFWTSLTYPAILLVLDFLNFIFTFVAATALAYGIRCHSCKNKTYLEQNKII
QGSSSRCHQSQAAVAFFYFSCFLFLIKVTVATMGMMQNGGFGSNTGFSRRRARRQ 
MGIPTISQV 
YDR453C DNA Sequence



1 GTAGATGAAT TCAAATCTAT GATTAAGAAC AATGAATTCA TTGAATGGGC
51 GCAATTCTCC GGTAACTACT ATGGTAGTAC TGTCGCTTCC GTCAAACAAG
101 TCAGTAAATC TGGTAAGACT TGTATTTTAG ATATTGATAT GCAGGGTGTC
151 AAATCTGTCA AGGCTATCCC AGAGTTAAAT GCCAGGTTTT TGTTTATTGC
201 TCCACCATCG GTCGAGGATT TGAAAAAAAG ATTAGAAGGT AGAGGTACGG
251 AGACCGAAGA ATCCATCAAC AAGAGGTTAA GCGCCGCTCA AGCTGAATTG
301 GCATATGCTG AGACAGGTGC CCATGACAAA GTTATTGTCA ATGATGATTT
351 GGACAAGGCC TACAAGGAAT TGAAGGATTT TATCTTTGCA GAAAAATGAT
401 GTAGCCCTAT ATAGACATTA CTAAGTATGT ACCTGGTAGG AGAGTGCTGT
451 CGCAAAGCGA CAAAACGTCC AATTATTCAA TTAATATAGT GTAAAAGTTC
501 TCAACGGGCT TATGCTAGTT TTTTTTGTTA GTAAGCGCTA CGACGACTAG
551 AACCATCTCT TGAATTTCCA AGTGCCAAAA TCAATGACCA CGGATACTGT
601 GGCCAGGAAT CTGTTGGTTG GTCATCCTCA AGATCTAGAC AATATCATAT
651 TGGGCCAGTA TCTGATTATC TTAACTATAT GCGCCCCTCT AGTTTACAAG
701 TTTTAGTCAT TGGGGGTTGG AAGGGCTGAT CCCCCCTTAC AATTGGCGTC
751 GTTTAGGAGC GGGCGAGGCT CTCCTTTCTC TTACACATCT GCTAAGGTGT
801 TTGTTACCCG AGTAATCAAG GATCAACTAT GGATGAGATT TAGATTAACG
851 TATTTAGAGC AGACGATTGT AAGAATATAT TTTGTAATTT CGATTGTTTT
901 TTGCTACTTA CATTGTTTAT CTTGAAATAT CCAAAGTGAA CACTATTACT
951 GTTTTTTGCT CAAGAATATA TTAGCCTTAC AAGAACGTAA AAAACCAATC

1001 ATGGTAGCAG AAGTTCAAAA ACAAGCCCCA CCATTTAAGA AAACCGCCGT
1051 AGTCGACGGT ATCTTCGAGG AAATTTCACT GGAAAAGTAT AAAGGTAAGT
1101 ACGTTGTTCT AGCTTTTGTC CCATTGGCTT TTTCATTTGT CTGTCCAACT
1151 GAGATTGTTG CGTTTTCCGA TGCCGCCAAG AAATTCGAAG ATCAGGGCGC
1201 CCAAGTTTTA TTTGCCTCCA CCGACTCTGA ATATTCCTTA CTGGCATGGA
1251 CCAACCTTCC CAGAAAAGAC GGTGGATTAG GTCCAGTTAA AGTTCCTTTG
1301 CTTGCTGATA AGAATCATTC CTTATCCAGA GACTATGGCG TTTTGATTGA
1351 AAAAGAAGGT ATAGCTTTAA GAGGTTTGTT CATAATCGAC CCGAAGGGAA
1401 TCATTAGACA TATCACTATC AATGATTTAT CTGTTGGCAG AAACGTCAAT
1451 GAAGCTTTGA GATTAGTCGA AGGTTTCCAG TGGACTGACA AAAATGGTAC
1501 AGTTTTGCCA TGCAACTGGA CCCCAGGAGC CGCCACCATC AAACCTGACG
1551 TTAAAGATTC CAAGGAGTAT TTCAAAAATG CCAATAATTA A

1592 TCTTCGCAC
1601 GATAACGCTA GGCCCTATTA AATAATTAAA AATACATCAC CCTATATATG
1651 ATAAGAAAGA TGGTTTTGTA TTATTATGAA ATTGACTTGA AAGAATAGTG
1701 TAACAAAAGA AAAAGAAACT GTAATTGAAG AATGATATGC ATTTCTATGT
1751 GTATATTAAC TTAATCATCT TTATATCCAG AAGACGCAAA T
YDR453C Protein Sequence.

MVAEVQKQAPPFKKTAVVDGIFEEISLEKYKGKYVVLAFVPLAFSFVCPTEIVAF
SDAAKKFEDQGAQVLFASTDSEYSLLAWTNLPRKDGGLGPVKVPLLADKNHSLSR
DYGVLIEKEGIALRGLFIIDPKGIIRHITINDLSVGRNVNEALRLVEGFQWTDKN
GTVLPCNWTPGAATIKPDVKDSKEYFKNANN
Ergosterol Biosynthetic Pathway

[Figure: pathway diagram.
Intermediates shown: acetyl-CoA, acetoacetyl-CoA, HMG-CoA, mevalonate, mevalonate-P, mevalonate-PP, isopentenyl-PP, dimethylallyl-PP, geranyl-PP, squalene.
Branch products shown: isopentenyl tRNA, prenylated proteins, ubiquinone, dolichol.
Genes shown: ERG10, ERG13, HMG1, HMG2, ERG12, ERG8, ERG19, ERG20, ERG9.
Inhibitors shown: lovastatin, atorvastatin, etc.; zaragozic acid; thiocarbamates; clotrimazole; morpholines.]
SLT2 (YHR030C) DNA Sequence

(including 1000 bp upstream and 1000 bp downstream of start and stop codons, respectively): 

GTGGTGAAAATGAAGGAAATTTAQ^GATTGTGGATGACGAAGTTGTCATGGACA 

TGAGATTAGTGAGTCGGGTCATTGGTAATCCCTTGTTAAAGGAATCAAAGGAGTT 

TCGTCAAGATTTGAATGCCAGGCCATTAGCTAGATTGGAACGTTTGAAAATCTTG 

ATAAACTATGCAGTTAAGATCTCTCCX3CATAAGGAAAAATTCCCCTATGTGAGGT 

GGACAGTGGGTAAAAACAAGTACATACATGAGCTCATGGTCCCAGAGCGCTTTCC 

CATTGAT ATTCCCAGAGAAAATGTCGGGTTAGAAAGAACT CAGATT CCATTAATG 

CTATGCTGGGCAOTGTCCATTCATAAGGCACAGGGTCAAACTATTCAAAGACTAA 

AGGTCGACI^GAGGAGAATTTTCGAAGCCGGCCAAGTTTATGTTGCACTGTCAAG 

AGCGGTAACT ATGG ACACCI^ACAGGT CCT AAACTTTGAT CCAGGAAAG ATT CG C 

ACCAATGAAAGAGTAAAAGATTTCTATAAACX3TTTAGAAACTTTGAAATGACTTG 

CAACGAATAAATGCATATACTCTAGTTGAAGTTTTCTTTTCTTGTTCTATACAGG 

TTCGAATACTTGTGAGCCTATCTGTATAATTTAACAGAATCCCGAAATATTCATC 

TAGAAGCCATCTATTTAGCTAAGCCTACGTATGCGGCGATTTTTATATTATCTTT 

TTTTTTTTTTATAGAAGACTGCGAAATGTTGGCAGAATGGAAAGTTTCAGTGTTA 

AAAATAGAAACTGAAAAAGGAGATCTAGCCAGGAATATATCGAAAAAAAAAGTGA 

GK3GAAATCAGATCCTACACAAATATTTAGATTTAATTGAAGACCCTGGTCTGCCA 

G AT AT AT AT AT AT ATT AGACGAACTGTGCATT CAGT CAG CAAAT CT AGG CCACAG 

ATTTTCTTATTGAAGCTATCAAAATAGTAGAAATAATTGAAGGGCGTGTATAACA 

ATTCTGGGAGATGGCIK^TAAGATAGAGAGGCATACTTTCAAGGTCTTCAATCAA 

GATTTCAGTGTAGATAAGAGGTTTCAA.CTTATCAAAGA7VA.TAGGGCATGGAGCAT 

ACGG CAT AGTGTGTT CAG CG CGGTTTG CAGAAG CTG CCG AAGAT AC CACAGTTG C 

CATCAAGAAAGTGACAAACGTTTTTTCGAA.GACCTTACTA 

CGTGAGCTAAAGCTTTTGAGACATTTCAGA 

ATGATATGGATATTGTTTTTTATCCAGACG 

TGAGGAACTTATGGAATGTGATATGCA(X!AAATC^ 

ACX3GATGCTCACTATCAAAGTTTCACATACCAAA 

TTCATTCTGCAGATGTCTTGCATCXSTGATTT^ 

TGCIAGATTGTCAATTGAAAATCTGTGATTTTGGGTTAG 

AATCCI\3TCGAAAACAGTCAATTTTTGAC^ 

GAGOTCCXSGAAATAATGTTGAGTTACCAAGGATATACCAAGGCGAT^ 
GTCAGCTGGCIX3TATTTTAGCX3GAGTT^ 

AAGGATTACX3TTAATCAATTGAATCAAATATTACAAGTTTTAGGGACA 
ACGAAACTTTAAGAAGGATTGGTTCrAAAAATGTC 
AGGTTTCATTCCAAAAGTACCriTTTGTCAATTTATACCC^ 
GCATTAGACTTATTGGAGCAAATGCTCGCrGTTTGACCCTCA 

TGG ATG AGG CCCTGGAGCATCCTTACTTGT CT AT ATGG CATGAT C CAGCTGACG A 
ACCTGTGTGTAGTGAAAAATTCGAATTTAGTTTTGAATCGGTTAATGATATGGAG

GACTTAAAACAAATGGTTATACAAGAAGTGCAAGATTTCAGGCTGTTTGTGAGAC 

AAC C G CT ATT AG AAG AG CAAAGG CAATT ACAATT ACAG CAG CAG C AACAG CAG CA 

GCAACAGCAACAGCAACAGCAACAGCAGCCTTCAGATGTGGATAATGGCAACGCC 

GCAGCGAGTGAAGAAAATTATCCAAAACAGATGGCCACGTCTAATTCTGTTGCGC 

CACAACAAGAATCATTTGGTATT CACTCC CAAAATTTG C CAAGG CATGATGCAGA 

TTTCCCACCTCGACCTCAAGAGAGTATGATGGAGATGAGACCTGCCACTGGAAAT 

ACCGCAGATATTCCGCCTCAGAATGATAACGGCACGCTTCTAGACCTTGAAAAAG 

AGCTGGAGTTTGGATTAGATAGAAAATATTTTTAGGACAAAAAACTATAAGTAAC 

CGGGGAAGTATAGAATCACCATAGATGTAAGCTTACAGACAATGTGTATATATGA 

TGTATATGAACGTATACAAATATATATATATATACGTGCTCTTGTTGTAGCTCGT 

ATATCAAATTCCTCCTCCGACGCTTATCTTAATCGTACTCCGCGGAAGTTTGTTA 

TCGCCrrCTTGAATTCTTTCTTTTCGTTCATTTATGATTAGTCATCTATAGACAAT 

ATTCATTATTTAAGCACCTAGAATACTAAACTAAATGTCTAAATATGACA CAAGG 

AAGATAAGATAAAAAAAACCAAGCGCTTAGAATATGACTTTAATGGTACCTTTCA 

AACAAGTTGATGTATTCACTGAGAAGCCCTTTATGGGAAATCCAGTAGCAGTAAT 

AAACTTCTTGGAAATTGATGAAAATGAAGTCAGTCAAGAAGAATTGCAGG CAATT 

GCCAACTGGACAAACTTATCAGAAACAACGTTTTTATTTAAACCATCTGATAAAA 

AGTATGATTACAAGTTGAGGATCTTTACTCCAAGAAGTGAATTGCCATTTGCTGG 

TCACCCAACCATTGGTTCATGTAAGGCTTTCCrrTGAGTTCACCAAAAACACCACT 

GCGACrrTCTCTCGTCCAGGAATGTAAAATAGGCGCTGTTCCAATAACAATTAATG 

AGGGACTAATTAGCTTCAAAGCTCCGATGGCTGATTACGAAAGTATATCGAGTGA 

GATGATTGCTGATTATGAAAAAGCX3ATTGGTTTGAAATTCATAAAGCCTCCTGCT 

CTTTTACATACTGGGCCAGAGTGGATCGTGGCGCTAGTA^ 

GCTTC^TGCAAACCCAAATTTTGCrATGCTTGCACACCAGACAA 

CC^TGTGGGAATTATCCTAGaSGGCCCTAAAAAGGAAGCOSCCATCAAAAACTCC 

TACGAAATGAGGGCX3TTTGCTCCGGTGATAAACGTTTATGAAGAT 
SLT2 (YHR030C) Protein Sequence.

MADKIERHTFKVFNQDFSVDKRFQLIKEIGHGAYGIVCSARFAEAAEDTTVAIKK 
VTNVFSKTLLCKRSLRELKLLRHFRGHKNITCLYDMDIVFYPDGSINGLYLYEEL
MECDMHQIIKSGQPLTDAHYQSFTYQILCGLKYIHSADVLHRDLKPGNLLVNADC 
QLKICDFGLARGYSENPVENSQFLTEYVATRWYRAPEIMLSYQGYTKAIDVWSAG 
CILAEFLGGKPIFKGKDYVNQLNQILQVLGTPPDETLRRIGSKNVQDYIHQLGFI
PKVPFVNLYPNANSQALDLLEQMLAFDPQKRITVDEALEHPYLSIWHDPADEPVC 
SEKFEFSFESVNDMEDLKQMVIQEVQDFRLFVRQPLLEEQRQLQLQQQQQQQQQQ
QQQQQQPSDVDNGNAAASEENYPKQMATSNSVAPQQESFGIHSQNLPRHDADFPP 
RPQESMMEMRPATGNTADIPPQNDNGTLLDLEKELEFGLDRKYF
YKL161C DNA Sequence

(including 1000 bp upstream and 1000 bp downstream of start and stop codons, respectively): 

AGTAATACTTGCAAATATTGCAAAACTT^ 

GACCATTCTTTGAATCATCT 

AGACTATGCCCTAATTTCQ\ATGTTATTT^ 

CAGGAAACTCAGGCCCACATCCGCAAAAAAATATGTGCC^^ 

TTCAAAGATACH'TACCACTGCAGKSAA 

AATTTGACTTGATAATTGGACATAAGTACTCCATCGCCATCCCTTTTTAAAGAAG 
TTTCC^CAAGAATGAATGGCTAATTO 

ACAGTATCGACATTTTCTTACTCAATCCAACGAAGGAATAACCTATCT 
AAACGCCGTAGTTTTCAGCCCACAAGACXSTCATTAAAAGATTTGTTAATTATAA^ 
AATAGAAATATTTCTACCAGCATGATTATTCX3TTACTTGAAAGTCCCCAATAAAT 
TTCACTGTTTCCX3TTAACTGTTGTAGTTATTAAACX3 

AACAACACCGGAGAAACACGCGCAGACCCATTCX3AGTTAAAAATAGTAACT 

ATCAATCAATGCAGGAAGCACCGTAGGAATTAGTAAGAACTCGTATTTTG 

AAATGCCATGAAAGCAATTGACTTGCTGCAGT^ 

CAAAAGAAGGTACTTTTATGATGTTATACTAGGCAAAAAGCCTATTTAATGTAA^ 

TCCTAATTGTCGTTTGAGACTGGATGAAAAGGGACAAAATGGAAGGATAACTAAA 

GGTGACTTACCGCCAGATTAATTCGGCCTGGAATAGTTTGATATCGAAGAAAGAT 

TCAO^TTAAATGGCGACTGACACCGAGAGGTGTATTTTCCGTGC^ 

GATTTTATCCTAAATAAACATTTTC^ 

ACAGCCTTATTTGTTCTTCAACTTACACAGAAT 

TATCAGAAAAATACCAAACGCX5TTTGGCAATAAACT 

CGTGAATTGAAACTACTAAGACATTTAAGAGO 

TCGATACTGATATAGTATTTTACCC^^ 

TGAAGAACTAATGGAATGTGACCTTTCT 

GAAGACXSCACACTTTCAAAGCTTC^ 

TACATTCTGCTAATGTTTTACATTGTGACC^ 

TAGTGATTGCCAACTAAAAATTTGTAATTTTGGGCTATC^ 

AACCACAAGGTTAACGACXX3CTT 

AAGCACCAGAAATTTTGCTGAATTATCAAGAATGCACAA^ 

GTCAACAGGCTCTATCTTGGCCGAACTACTTG 

AAGGATTATGTAGATCATTTGAATCATATTC 

AGGAAACATTGCAGGAAATTGCCTCTCAAAAGK3TG 

CGGTAATATCCCGGGAAGATCXSTTTGAAAGCATACTACCTX^ 
GCGCTTGAATTGCTAAAGAAAATGCTAGAATTTGATCCTAAAAAAAGGATTACTG

TAGAGGATGCACTAGAGCATCCATATTTGTCAATGTGGCATGATATAGATGAC^ 

ATTCTC^TGTCAAAAGACCTTTAGAT^ 

GAATTAGGAAACX3AAGTTATAAAGGAAGTATTTGATTTCAGGAAAGTTGTTAGAA 

AACATCCTATTAGCXK3TGATTCCCCATCATCATCACTATCT 

TCCTCAAGAAGTTGTACAGGTCCATCCT^ 

CCTGAATTTTCCTATGTAAGCCAACTT 

AAAACCrTTATGGGAATAAGCTCTAATTCATTTCA 

CACCITCAAACAAGATACTAAGCATGAAAATAGTGAACT 

GAGCCAAATATAACAAAAATGAGCCCAGTTTCATCGTCTCCCCCAGGT 

TAAATGTCAATGATGGTACAAACCAAAATACAAATGAGGATGACAGCGATTTTTT 

CITCXSACCTAGAAAAAGAACT^ 

AACCACAACTAATAGATGCGCACATACACT 

TGTTCACCTTCTTAATTATTC^ 

CTCTCX^CX^TTTTGCAGTTCCTTCCGAAAGCGG^ 

GTAGGATACACCATTGCGTAGATTCGCGATGATCCX3AATATAAACATGATTCCCT 

CGTCAGTCCTCTCTCAAGTTTTCTTTCCCGTTTTAAA.TAGCTTA 

ACAAAAAAGTTGATATCATTTAAAGGTGCTTTT^ 

ATTACACCCCTTGAGAATTCAAGTTCATCTGAAATOT 

TTCGAGCAATTACTCTCTACAAATGGGATAAGAA^ 

CTTTGAAAGATTACATAGAGTGGCAAAATT 

TTTTTTTAOSC^^GGAAGCCTGT 

TTTGATAAC^TTCTTGACrGTGAGCCA 

TACTGGTTAATTACAAATTAAATGACTATCCri^ 

TATTTACACGGATTTACCCCAAGCAATT 

CTCAAGTCTACTTTATCTGATAACA 

CT 
YKL161C Protein Sequence.

MATDTERCIFRAFGQDFILNKHFHLTGKIGRGSHSLICSSTYTESNEETHVAIRK
IPNAFGNKLSCKRTLRELKLLRHLRGHPNIVWLFDTDIVFYPNGALNGVYLYEEL
MECDLSQIIRSEQRLEDAHFQSFIYQILCALKYIHSANVLHCDLKPKNLLVNSDC
QLKICNFGLSCSYSENHKVNDGFIKGYITSIWYKAPEILLNYQECTKAVDIWSTG
CILAELLGRKPMFEGKDYVDHLNHILQILGTPPEETLQEIASQKVYNYIFQFGNI
PGRSFESILPGANPEALELLKKMLEFDPKKRITVEDALEHPYLSMWHDIDEEFSC
QKTFRFEFEHIESMAELGNEVIKEVFDFRKVVRKHPISGDSPSSSLSLEDAIPQE
VVQVHPSRKVLPSYSPEFSYVSQLPSLTTTQPYQNLMGISSNSFQGVN
PIR3 (YKL163W) DNA Sequence

(including 1000 bp upstream and 1000 bp downstream of start and stop codons, respectively): 

TCTGGCTTCGAGGAATTATTACCTAAATAGGAAAGGCAGAATATATTAGAAAAAA 
AAGAAAAACCAAATGAGAAAAGTGCTGGTGCTAAATAAAACATTATTGAGGGGCC 
AAGAGGGGACAAAAGAAGATATAACTAGATCATTAAGTTTTCGCTCTAGTAACAG 
GAACAAAGATTGTGAGATACACTGTTATGCTAAGAGACGGTGCGATATTCTGTAC 
GAAAATTATTTAACTATTAACTAAATGTATACCACTTCACGTGCCACCGAGTAGG 
TTT CTAAAATGTG CAAC CATTTTAGGT ATGTG CGCAGCTCTTT ATT CTAAACGGG 

AGTCACTACATTACTATTATCGTGTTTTTGCCCATGTACTTTCTTATAATCTTAA 
GACAACAACGGGATGATAGGCGCATTCGGACTTTCATTGATGCAAATGTGTGAAA 
AATGCATCCAAAAGACAACTTTTGTACAGAATACAATTGCAAAAATACTTTACGG 
GCATAGATCGGTAAGGTCACCGGGAAGCTAGCGTAAGAGACCTTATTCGGAACCG 
AGCAACCATTTCCGAATGTAGTAGTAGTTGAAGGAGTAAATCGACCTTATTGTAC 
ACTACTTCCTTTAAATTTGATTTCTGGCCCCGCGCAATTTCTTGGCGGTTAAGCT 
GTATTTTTACCTCATCGGGAAAAGTTATTGCAAGTTAAAGGGGATCAAACGATTA 
G CAAACT AATT AT AG AT CAAAGG C CG AGGG CT T T C T AAATTTGGCAT AT TT CG C C 

GTCGACTGAAATAGAAGGGATAAATCATGCATCTCCAGGATTATCCCTACTCCAT 

TCATTACAACATGCGCCAAATCAAGCCTATATAAGATTCTCGTCATTTAGCATGC 

TCTATTGATTTGTGTCTTGTTTTGTCTAACACTGAAACTGTAACCTAAGATTTCT 

TTAGATAATTATTACATTTACATGAA.TAAGAAATCTCATAAAACAAGTACTGTTT 

ATAAGTAAAAATGCAATATAAAAAGCCATTAGTCGTCTCCGOTTTAGCTGCTACA 

TCTTTAGCTGCCTATGCTCCAAAGGACCCGTGGTCCACTTTAACTCCATCAGCTA 

CTTACAAGGGTGGTATAACAGATTACrCTTCGAGTTTCGGTATTGCTATTGAAGC 

CGTGGCTACCAGTGCTTCCTCCGTCGCCTCATCTAAAGCAAAGAGAGCCGCCTCT 

CAGATAGGTGATGGTCAAGTACAGGCTGCCACTACTACTGCTGCTGTTTCTAAGA 

AATCCACCGCTGCTGCTGTTTCTCAAATAACTGACGGTCAAGTTCAAGCro 

GTCTACTGCCX3CrroClX3TTTCCCAAATAACTGACX^TCAAGTTCAAGCTGCT^ 

TCTACTGCCGCTGCCGTTTCTCAAATAACK^CGGT 

CTACK3CCGCIX3CCX3TTTCrrCAAATAACTGATGGT 

TACTGCTGCCGCTGCCrCrrCAGATTTCTC 

ACT AAGGCTGCTG CAT CCCAAATT ACAGATGGGCAGATACAAGCAT CTAAAACT A 

CCAGTGGCX3CrrAGTCAAGTAAGTGATGGCCAAGTC(^GGCTA 

AGACG CT AACGAT C CAGT CGATGT TGTTT C CTGT AAT AACAATAGT AC CTTGT CA 

ATGAGTTTAAGCAAGGGTATCTITAACCGATAGGAAGGGTAGAATTGGCrCTA 

TTGCCAACAGACAGTTCCAATTCX^TGGTCCTCCACCACAAGCTGGTGCTATCT 

TGCTGCIX^TTGGTCCATCACCCGA3AAGGTAACT^ 

ACTTTTTACCAATGTTTGTC!TGGTGACTTC!TATAACTTGTATGATAAGCACATTG 
GTT CT CAGTG CCATGAAGTTT ATTTG CAAG CT AT AGATTTAATTGACTGTTGAAC 
GATGCATCGATCAATCGGAGTCGTCCTCCTTTAACTTCACGAATTAGTTGCCACT
CTCATTCCCCACACATAAACTTGTTTTATGGCATCCTTTTCATTTAGCATGTCTT 
TATTTCCAAACCITTCCTCX3TTCTTTGCATTCATTTAGCGTTTGCTCX3AGAAAGC 
ATCACGTTTTCAC^CATTATCGTTCGTCGCTATAATAAAAATAGTTATAGAATTT 
ACTCAGATTTACATGTCGTACCTTTTTAATTGTAAAAAAAAAAATTTTATGATAC 
ATAATTACCTAAATATAATTCAGAATCAAACATACTTATAGCTATTTGTATGCTA 
TTAGGTGGTCCTGCTATAAAAATATCGTTTA 

ACTTAGTCGC^U^TTGCAGAAGCTTTCCCTGAGAAAAAATTTGTGAAGCTAGCTGC 
GAT AG CAAAGGAG CG CTTAAGGT AT AGAAAAGCACT CAGCTGGAATG CCAAAAGA 
TAGTTTAGCAACTGACCAAGGAAAAAGCTTGTAGGTAGACTTAACTTCATTGTTC 
TCTAATCCTTTCGTCGTGTATATTGTAAAAACTGCTGAACGAGTATTGATAAAAG 
ATATCTTGGCCACTAAGGGGCAGATCCCCTTCTGGTGTGATAGACAACCCCAGGA 
GCATAGATAACACCAACrTGTGGTGGAGGGTCATCGAATTGGAATTGTCTGTTGG 
CAACAGTAGAACAGATGCTGCCCTTGCTATCTGTCAAAATGCCGCTCTTCAAAGT 
TACTTTCAAAGCGCTGTCACTGCTACGACAGCTTTTTTTTTAGAAACAGCAGCAA 
TGCCTTGACATGTAACGTAAGAAAAGAAAAAAGAGATGGCAGAAGAAATACTAAG 
CGATAACGGCAATGTAGAGGTGCTTTTTTTATCX3GAATAAATAGAGAAGTCAGTA 
ACAGTGATTGCTGTGGCTCCCTCTTTAATCGTATCTATGTAGGTTCCGATTAAAG 
TGGTCGTG 
PIR3 (YKL163W) Protein Sequence.

MQYKKPLVVSALAATSLAAYAPKDPWSTLTPSATYKGGITDYSSSFGIAIEAVAT
SASSVASSKAKRAASQIGDGQVQAATTTAAVSKKSTAAAVSQITDGQVQAAKSTA
AAVSQITDGQVQAAKSTAAAVSQITDGQVQAAKSTAAAVSQITDGQVQAAKSTAA
AASQISDGQVQATTSTKAAASQITDGQIQASKTTSGASQVSDGQVQATAEVKDAN
DPVDVVSCNNNSTLSMSLSKGILTDRKGRIGSIVANRQFQFDGPPPQAGAIYAAG
WSITPEGNLALGDQDTFYQCLSGDFYNLYDKHIGSQCHEVYLQAIDLIDC
YPK2 (YMR104C) DNA Sequence

(including 1000 bp upstream and 1000 bp downstream of start and stop codons, respectively): 

CCAACCACGTAAGGGAT^AAGGACGGTGTTTGGGCCATTATGGCGTGGTTGAACAT 
CTTGGCCATTTAC^^CAAGCATCATCCGGAG^^ 
C^GAATGAATTCTGGGCAAAGTACGGCCGTACTTTCTTCACTCGT^ 
AAAAAGTTGAAACAGAAAAAGCTAACAAGATTGTCGATC^ 

TACCAAATCX3GGTGTTGTTAATTCCGCCTTCCCAGCCGATGAGTCTCTTAAGGTC 

ACCGATTGTGGTGATTTTTCATACACAGATTTGGACGGTTCTGTTTCTGACCATC 

AAGGTTTATATGTC^^GCTTTCCAATGGTGCAAGATTCGTTCTAAGATTGTCAGG 

TACAGGTTCTTCAGGTGCTACCATTAGATTGTACATTGAA 

AAATCACAATACCAAAAGACAGCTGAAGAATACT 

TCATCAAGTTCTTGAACTTTAAACAAGTTTTAGGAACTGAAGA 

TACTTAAAACGAATGATTTACTAATGGCT 

TATTAACGGTAAAGAAGAAAATTTCAATTTTTTGAACACIATACTTTATATACTTA 

ATAGATCCATATTTCX5ACATATTAGCAAACGATTGCATAGGTTTCTGAGTCT 

TTTTTTTTTTTTCATAAGGAGGAGAATATTTTGGTTAATCGCAGTATCTTCTTCA 

TAAGTGCTGTTTCTAATTATATCTAATTCACGAATTTTTCCCAAATTAGCGTATC 

CCCGAATTCAGATTACCTACCCCGAGTTTTTTATTATATTTCCCTCGAGAAATCT 

GTAAAATGGCCGTCATCCTTAGATTTATAAATAAAATGATAAAATTCAGCCi\AAG 

TGCTCCTAAACCAGAATTGTTCAACTGGGTGZU^TTATCGCGTATACAAATATAC 

ATATAGTAACATGCATTCCTGGCGAATATCCAAGTTTAAGTTAGGAAGGTCCAAA 

GAAGATGATGGGAGTAGTGAAGATGAAAATGAAAAATCGTGGGGTAATGGCCTGT 

TTCATTTCCACC^TGGAGAAAAACATCACGATGGTAGCCC 

TGAAC^CXSAACACCATATAAGAAAGATCAATACAAATGAGACT 

TTAAGTTCTCCAAAATTACX3TAATGATGCATCCTTCAAGAATCCATCGG 

GAAATGACAATTCTAAGGCITCCGAAAGGAAAGCT 

GCAGGGACCGAGTTCGGAATCCGGACTAATGACAGTGAAGG 

GATTTTACTCITCCCrTCCCTATCACCT 

TAAGTTCCX^C^TCCITACTT 

AATGCGGCAGCTACCACX3ATACAAGAGAGTGGAT 

TTGATAGATAGAGCTTTTGCCACTAAATTGA.TTCCT 

GGTCAACAAATTCAAGCCCATTACTTTATT^^ 

TACTACTATTAGTCCAGATATGGGAACGA 

TCX3ACATTTGATGTAACAAGAAAATTACX3ATTT^ 

GGATTCC^TCCCTACraTTACCCrCTAAAAACT 

GGACGAAGTACTGAAGGAGATTTTAAAAAAAATCAATACAAAT^ 

TTGGACTCCTTCCATTTACCTTT^ 

GACTATACAATCACCATTGGATTTCTTTAGAAAGGGG^ 
CACGGTGGACTACAAACCTTCTAAGAACAAGCCTCTCTCCATTGATGACTTTGAT

CTATTGAAGGTTATCGGGAAGGGTTCGTTCGGCAAAGTGATGCAAGTAAGGAAAA 

AAGATACCCAAAAGATTTACGCTTTGAAGGCTCTGAGAAAAGCATATATTGTATC 

GAAATGTGAAGTGACACATACTTTAGCGGAGAGGACTGTCCTAGCAAGAGTTGAC 

TGCCCCTTTATTGTTCCGTTGAAGTTCTCATTCCAATCTCCGGAGAAGTTGTACC 

TAGTATTAGCTTTCATTAATGGCGGTGAACTGTTCTACCATTTACAACACGAGGG 

ACGATTCAGTCTAGCACGCTCCCGTTTTTATATTGCAGAACTATTATGTGCTCTC 

GATTCATTACACAAACTTGACGTCATTTATCGTGACCTAAAGCCTGAAAACATTC 

TATTGGATTACGAAGGACATATTGCACTGTGTGATTTTGGGCTTTGCAAGCTGAA

CATGAAGGATAATGACAAAACAGACACTTTCTGTGGTACTCCCGAATATTTGGCA

CCAGAAATCITGTTGGGGCAGGGCTATACTAAAACAGTTGACTGGTGGACATTAG 

GTATCTTACTGTATGAGATGATGACAGGGCTGCCACCATACTATGATGAGAACGT 

TCCTGTTATGTACAAGAAAATTCTGCAGCAACCGCTACTATTTCCTGATGGATTT 

GACCCTGCGGCAAAAGACCTATTAATTGGCCTCTTAAGCAGAGACCCAAGCAGAA 

GACTCGGCGTTAACGGTACAGATGAAATTCGTAACCATCCTTTCTTTAAAGACAT 

CTCATGGAAAAAGCTACTTTTGAAGGGCTATATTCCGCCTTACAAGCCAATTGTA 

AAGAGTGAAATAGATACTGCAAATTTTGATCAAGAGTTCACTAAGGAAAAACCGA 

TCGATAGTGTAGTGGACGAGTACTTAAGTGCAAGTATTCAAAAGCAGTTTGGTGG 

GTGGACGTACATTGGTGACGAACAGTTGGGTGATTCTCCTTCGCAGGGGAGAAGC 

ATTAGTTAGAAGCAAGCCGAAGCAAGCCGAGCCX3AGCCGGACGGAATTTATAGCT 

ATAGCCGCAAGAGGTTGGAATTTTCAAAAATGGATAGTTCAAGTAGATTGCGATA 

CGCACTCCX3TTAOTATTGTGGTTAACGGGGACAAGAAGAACTACAGAAAATAGAA 

TGGTCCGCAGAGGCTGCX3CTCTTCTTTTAGCAACTCTCACACX3ACTTATGTTGCT 

TATTCATTTCTTTTACAGCATTATCAGAATTCTTCCATCTACGGAATTGAGATCA

AAGACCGACCTGTTGTCGGCCX3AAGGACGGACGCTTATACCCGCGGATGTCAAAG 

GGAAGCCCGCGGGGCGCAAGTCGAGGTTACCGGAATTCGCCAAACGGCAAAGGAC 
CCTTGCATTGCCIXaAAACKSAAAGATTCGClTTT 

CATAGTCTGGGCCX^3GAGCAGCTTATTTCTTCCX3CX3GATGA 

GCGCX5GGCTCAGCCATGGGGAGCCTTACCTAGTCCCGTAAAGGGAAAAAGCTAAC 

CTCATTCGCCTCACAGGGTGAAAGCGTGAACAAAAAAAAAAGAA^ 

TTAAAATTTACAGTATATATATATTTGTATTTACGTATTAAACTATATATAAATA 

GATATGTATGCCGAAAAAGTAAAGTCTGGGTGATGCCTAGTCCAATCTTTCTTAC 

TAC!TGTCCAGTTTCTATCGTAGCAGTTAATTATACATAGAACTGTGTAAATTCAA 

CGCATTAATTTTTTTTTTTTT CACITTCGCAGTTAGGGGGGACACATTTTTTTTG 

CCCTTTCTTAAGCTTCGTAAGCGAGTTACATGATTATTTCTTCCTGGGATACAAT 

ACGCX3TTCGTACAAGTCACAGCTGGACCGTATAGGGAACAAGACIGCAACTCTCT 

CCAACTTGTTAAACAGAGGAGGAAAAGAAAGAGGGAAAAGAGGAACAAAGACAAT 

CAAAGAAAAAGAATAGAAA 
YPK2 (YMR104C) Protein Sequence.

MHSWRISKFKLGRSKEDDGSSEDENEKSWGNGLFHFHHGEKHHDGSPKNHNHEHE
HHIRKINTNETLPSSLSSPKLRNDASFKNPSGIGNDNSKASERKASQSSTETQGP
SSESGLMTVKVYSGKDFTLPFPITSNSTILQKLLSSGILTSSSNDASEVAAIMRQ
LPRYKRVDQDSAGEGLIDRAFATKFIPSSILLPGSTNSSPLLYFTIEFDNSITTI
SPDMGTMEQPVFNKISTFDVTRKLRFLKIDVFARIPSLLLPSKNWQQEIGEQDEV
LKEILKKINTNQDIHLDSFHLPLNLKIDSAAQIRLYNHHWISLERGYGKLNITVD
YKPSKNKPLSIDDFDLLKVIGKGSFGKVMQVRKKDTQKIYALKALRKAYIVSKCE
VTHTLAERTVLARVDCPFIVPLKFSFQSPEKLYLVIAFINGGELFYHLQHEGRFS
LARSRFYIAELLCALDSLHKLDVIYRDLKPENILLDYQGHIALCDFGLCKLNMKD
NDKTDTFCGTPEYLAPEILLGQGYTKTVDWWTLGILLYEMMTGLPPYYDENVPVM
YKKILQQPLLFPDGFDPAAKDLLIGLLSRDPSRRLGVNGTDEIRNHPFFKDISWK
KLLLKGYIPPYKPIVKSEIDTANFDQEFTKEKPIDSVVDEYLSASIQKQFGGWTY
IGDEQLGDSPSQGRSIS
YLR194C DNA Sequence.

(including 1000 bp upstream and 1000 bp downstream of start and stop codons, respectively): 

GGATATGATTGCTGAGAATGCGTTACCGGCCAAAACAAAGACAGCGGGATTGAGA 

AAATTAAAGAAGGAAGATATTGACCAAGTTTTTGAGTTGTTCAAAAGATATCAAT 

CCAGGTTCGAACTAATTCAAATTTTCACAAAAGAAGAATTCGAACATAATTTCAT

TGGTGAAGAATCGTTACCATTGGATAAACAAGTAATTTTCTCATATGTAGTCGAA 

CAGCCCGATGGAAAAATTACAGACTTCTTCTCATTTTACTCATTGCCATTCACAA 

TCCTAAATAACACAAAATATAAGGACCTAGGCATCGGGTACTTGTATTATTATGC 

CACCGATGCAGATTTCCS^TTCAAAGACAGGTTTGATCCAAAAGCTACTAAGGCT 

TTGAAAACAAGATTGTGTGAATTGATTTATGACGCTTGTATTTTGGCCAAAAACG 

CTAATATGGATGTTTTTAACGCGTTGACTTCGCAAGATAATACATTGTTCTTGGA

TGATTTGAAGTTCGGGCCCGGTGACGGGTTCTTGAACTTCTATTTATTTAATTAT 

AGAGCAAAGCCGATTACCGGTGGCTTGAATCCCGACAATAGTAACGACATTAAAA 

GGCGTAGCAATGTCGGTGTTGTTATGTTGTAGTGGCTGAAAGGACGAGGCGTATA 

TAGTTTTCGTGTACATAGCCGACAGAATTTGACCACATTTAGTTTTTCCGCATAG 

TCAATTGACGAAGTGAAAAAATAATTAATCCAATGGCTGGCTTTAGAGTGTCAGC 

CTCCAAAATAAATCCAAAAATAGACAAAGAGAATCACTATAATTACCGCCTTGGA 

GTCCAAGTTGGCTTGAGAACTCGCATTTATTTTTAGCGACTGAGGTAGCTGAAAA 

ACGCCTACTTTCTCAGSAAGGCGGTAGTGAGCATATATAAGTATGTAAGAAAGATC 

AACTCTTCTGGACTAGATACTCACCGATCTAGTGAAAATATAAACAAACCCAACA

TATATATAAAATGAAGGCCTGTTCCATATTATTTACCACCTTAATTACTCTAGCC 

GCIGCTCAAAAAGACrrCTGGTTCCITAGATGGCCAGAACTCT 

AAAAGGAAAGCTCAAACTCTCAAGAGATCACACCn'ACCACGACAAAGGAAGCCCA 

AGAAAGCGCATCAACTGTAGTTTCrACCGGAAAAAGCrTAGTACAAACTAGCAAC 

GTCX5TCAGCAACACCTATGCTGTGGCTCCAAGTACCACCGTAGTGACX3ACGGATG 

CACAAGGCAAAACCACGACACAGTACCTATGGTGGGTGGCCGAAAGCAACTC^^ 

CGTAAGCACAACTTCAAC^TGCCTCTGTGCAGCCCACCGGAGAG 

ATCACCZACTCCGCATCCTCCTCAACGACATCAACATCAACGGACGGGCCAGTTA 

CTATAGTAACTACCACXSAATTCGTTAGGTGAGAtOT 

GCTACCGTCCTCAGCCACAACTGACAACACGGCTTCATCAAGTAAATCATCTTCG 
GGATCCTCATCAAAACCGGAATCAAGCACCAAGGTAGTAAGCACTATCAAATCAA 
CTTATACCACTACGTCAGGTTCTACAGTAGAGACACTGACCACTACATACAAGTC 
TACAGTCAACX^TAAGGTAGCGTCCGTAATGTCC^^TTCrACCAATGGCGCCTTT 
GCOGGCACTCACATAGCTTATGGTGCX3GGTGCATTCX3C 

TATAGAATGTATAATCAGTTCTX3TATACCACCACATAGTTCTGCATTTTAATAAA 
ACTCTTTCTTTTTATACACTGTAGGTAACCAATAATATAACTATTGTTATCATCG 

CCTGGATAATAAGCTAGAAAAAAAAAATATATATGACAGATGGATGAGTAATCAT 
ATTCAATAAGTATTGTCTGGCTTCTGAGACGGCGGTAAGATATCCTTAAGAGTTG
CAATGGTCCTTTTACACAAAAGCAC^ 

TTCACGCGCCTTTTTTAATAGATACCCCAATCCATACTCCCCCCATGTACTATCC 
ATAGAC^CAATATCAAGGAACGTTGATCAAGAA<^^ 

TGTTGAAAAAGTCCGGAAAGCTGCCCACATGGGTCAAACCCTTTTTAAGAGGTAT 
AACAGAAACATGGATAATCGAAGTTTCCGTAGTGAACCCCGCTAACTCCACAATG 
AAAACTTACACTAGGAATCTGGATCACACTG 

CTACCTATCAATTTGACAGTGCTACAAGTAGTACGATAGCAGACAG 
GTTTTCAAGTGGCTTCAATATGGGTAT 

ACTAT^ATTTGACGAAAACGTTAAGAAAAGCAGAATGGGCATGGCATTTGTTATCC 
AAAAACTCGAAGAGGCGAGAAATCCTCAGTTTTGATGTTCCCATTTAAAGATCTT 
TAAAGATATCACCATGGG03AGCGAAATTGAGAAAACTAGTG(^GCT 
GTCACGTCCTAAAAATTGTAAATAAGCGCT 

CATATATAATTATTTATTTATAAC^GTC^TTCTGCTAAACTATACATCAAATGTC 
ACTAATCTTGATATT 
YLR194C Protein Sequence.

MKACSILFTTLITLAAAQKDSGSLDGQNSEDSSQKESSNSQEITPTTTKEAQESA
STVVSTGKSLVQTSNVVSNTYAVAPSTTVVTTDAQGKTTTQYLWWVAESNSAVST
TSTASVQPTGETSSGITNSASSSTTSTSTDGPVTIVTTTNSLGETYTSTVWWLPS
SATTDNTASSSKSSSGSSSKPESSTKVVSTIKSTYTTTSGSTVETLTTTYKSTVN
GKVASVMSNSTNGAFAGTHIAYGAGAFAVGALLL
PST1 (YDR055W) DNA Sequence

(including 1000 bp upstream and 1000 bp downstream of start and stop codons, respectively): 

TCTTGTTCTTTAC^GATTCAAGAGGAAACCAAAAAAAAATCAAAGAAAAAGAATC 

GAATTTTTCCCAAAATGAAAGTGTAAGGAAAAAAAAAGAGGAGATAGAAAATCCG 

AAGAACCCCAAGGGACGGACAAACACAAGACGATGCTGCACGTGGTTAGTTTTGT 

AAGCGCAGGTTACGATAAAGAGCATAAACAAATCATTACTAAGAGCGGTATACAA 

GAATAAAGTGACAAACAGTTCTCCCTATTTAACGCACTTAACGTAGGTTCCATCA 

TTATGATGCTATTGCCACATCAAATCTCCTTTGGACTGAACCCGCATTAGTAATT 

GCCCXSCTTTTCTTTTCTTCCGCGGGTGGGCCCCATAAATAGAAAAAAAAAGAAAG 

AAAGCGTTTAAATAAATAGAGTGAGCGGATTTCTATTATCTGAAAACCGGGTTAT 

AATGCAOTTGATATGCACGTGGGAGCrGGGaSGCrATTTTTTTCT 

TATTTGAGTCGTTTAAAATAGCACTCCCCGTTGACCCGCCAACTCATTTTTGTTT 

TCTCTTTACGGAAAAGGCTTTAAATTAAGGCCCGCATTTTCGGTATCCTTGAGGG 

AAAAAAACCAAAGAAACCCAAAAAAGACCACAAAGCTGGGATATCTTAATTAGTA 

GAGAGGGCTTTTAGTTTTAATAGTGTTACGAGTCTCTAAAAATAGCGTAGGCACA 

CTGCCCTGATTCGGACTTTGATCAGAGTTTATTACTACAAAGAGTAATGTTGAAT

GATTGGGCTGGGTTTTCATAGCATTAACTCTAAGTAATATCATTCAACCGCTCAA 

GGTTCCTTACGAGCAAACCCATATATGCTCTACAGATAAACATATAAATAGCGTG 

CATATTCJTTCrrCTATTCAACTCTTGCTCTGTATAGTTCAATAGAATCTTACAGTA 

CATC1ACX3CTGCAATAGATCTAATCCAAGAGAGAAGCAAAAAAAAAAAGCTCGCTA 

TAAAAATATCATGCAATTACATTCACTTATCGCTTCAACrGCGCTCTTAATAACG 

TCAGCTTTGGCTGCTACTTCCTCTTCTTCCAGCATACCCTCITCCTGTACCATAA 

GCTCACATGCCACGGCCACAGCTCAGAGTGACTTAGATAAATATAGCCGCTGTGA 

TACGTTAGTCGGGAACTTAACTATTGGTGGTGGTTTGAA 

AATGTTAAAGAAATC^CGGGTCTCTAACTATATTTAACGCTACAAATCTAACCT 

CATTCX3CTGCTGATTCCTTGGAGTCCATCACAGATTCrT^ 

GACAATCTTGACTTCIK3CTTCATTTGG 

CTGATTACTCTACCCXKX^TCTCCAGTTTTACTTCAAATATCAAATCTGCT 
ACATTTATATTTCCGACACTTCGTTACAATCTGTCGATGGATTCTCA 

AAAAGTTAACGTGTTCAAOTTC^TAACAATAAGAAATTAACCTCGATCAAATCT 

CCAGTTGAAACAGTC^GCGATTCTTTACAATTTTCGTTCA^ 

AAATCACCTTCGATGACTTGGTTTGGGCAAACAA 

CTCTGTTTCCTTCGCTAACTTGCAAAAGATTAACT 

AACTCC ATC TCAAGTTTGAATTTCACTAAGCTAAACACCATTGG 

GTATOTTTTCCAATGACTACTTGAAGAACTTGTCGTTCTCTAATTTGTCAACCAT 
AGGTGGTGCTCTTGTCGTTGCTAACAACACTGGTTTACA^ 

GACAACCTAACAACCATTGGCX^TACITTGGAAGTTGTTGGTAACTTCAC 
TGAACCTAGACTCTTTGAAGTCTGTCA

AAGCAATTTCTCCTGTAATGCTTTGAAAGCTTTGCAAAAGAAAGGGGGTA^ 
GGTGAATCTTTTGTCTGCAAAAAT^ 

CCACTTCCAAATCTCAATCAAGCCAAACTACTGCCAAGGTTTCCAAGTCATCTTC

TAAGGCCX3AGGAAAAGAAGTTCACTTCTGGCX3ATATCAAGGCT 

TCTAGTGTTTCTAGTTCTGGCX3 

ATGCCGCTATCATGGCACCAATTGGCCAAACAACC 

GGCAATCATCATGTCTATAATGT^ 

AACTAGTACCTGTCATTCACGAG^^ 

TTTATGTATTCAAATATTTTCGGGAAAGAGATAAAAGTAACGACACTTAAAA 

TAAAAAATCACAATACTTTATTTACT 

TGTTGTTGCTTCTTTGCTGAGC^ 

GTAACATTTTCCAGTTCTTTTGAAACCAAGACACOTCCTTAACOT 



GTTTTGGTTTTCTTCTTTAATTTGGTGACTGGTGCAGTAGGTCCCGCTTCTGGAT 

ATCTGACTGTAGCAGTAATAGCATCGTTAGT 

TTTGACCTCATTATCTTCATC^ 

CTCAGCTTCATGTAGCTAAAACATGGCATATCCAGCTTAC 

TCA7\ACAGTATTCTCCAGAAACTTCAACATCCTGTATAT 

AACATTCCCATCXXSATGTAC^^ 

TTOSCATC^TCrGAATAGCTTAATTGTAAAATATCAGCA^ 
CCAATAAAATCACACGCAACAGCCGCACAAGCAT 

ACAGTTCTTGTTAGATAGTGTTGAATAGTACCACCTTGTTTTTTTACTCAAAGTG 
TCI w i w ITATATACTTCrAATTATTCCTATATTTGGTTGGGTTTTTAA 
GCAAATACAGTGGTTAGAGACCCAGCX3 
TTCAAGTAAAAATGTAGCTTCATAAAAAAGAAGCA 
PST1 (YDR055W) Protein Sequence.

MQLHSLIASTALLITSALAATSSSSSIPSSCTISSHATATAQSDLDKYSRCDTLV
GNLTIGGSLKTGAIANVKEINGSLTIFNATNLTSFAADSLESITDSLNLQSLTIL
TSASFGSLQSVDSIKLITLPAISSFTSNIKSANNIYISDTSLQSVDGFSALKKVN
VFNVNNNKKLTSIKSPVETVSDSLQFSFNGNQTKITFDDLVWANNISLTDVHSVS
FANLQKINSSLGFINNSISSLNFTKLNTIGQTFSIVSNDYLKNLSFSNLSTIGGA
LVVANNTGLQKIGGLDNLTTIGGTLEVVGNFTSLNLDSLKSVKGGADVESKSSNF
SCNALKALQKKGGIKGESFVCKNGASSTSVKLSSTSKSQSSQTTAKVSKSSSKAE
EKKFTSGDIKAAASASSVSSSGASSSSSKSSKGNAAIMAPIGQTTPLVGLLTAII
MSIM
KSS1 (YGR040W) DNA Sequence (including 1000 bp upstream and 1000 bp
downstream of start and stop codons, respectively):

TTGGGATTCCATTTTTTATAAGGCGATAATATTAGGTATGTAGATATACrrAGAAGTTCTC 

CTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATATT 

ATTATCATCATTTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTT 

TCAGCTTCCACTAATTTAGATGACTATTTTTCATCATTTGCGTC^TCTTCTAACACCGTA 

TATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCA 

ACAGTATTTATGTTTTGTCATTCTTTTCTACATAATCTTGAAACTAGGTAGATCTACAAT 

TGAAAAGTAAATACTAACATTATTTACTAAATTTAAGTTAGAAATCGGCACGAAAAAAAT 

TTGACAGATTACGAGAGTCCAGCCAAAATATGAGTATATTACTATTTCCCCTTGGTGAAA 

GAAATGAAAGATGTTATTTTTTACCGGCTTAGTAATACTGAGCTACTTACTTGGGGGAAA 

GAAAGATTGGCTACTTATTATGTATGAAGCCTCAGATTACCTTGAATTCCTGAACCG 

GAGCAGTATGCTCTTCAAATTCGAACTTTTTGAACATCTTTCCTCCACATTCC 

TTCACATTCAAAACGCGCTGTGAAGCTGTTAGAAATTTACAGATCGAGGCATATTTC^ 

ATATAATGTATTTTTATTAAGACACCCAAAGTACTTCCAATCTGTAGATATTGCACTTTA 

TCTGACCAGAAGCCAGACTTGAACAGTTACATATTGTGCITTGCAGTCGTTAAATTTCCC 

GAACTGTTTTCGTATTTTTTTTTCTTTTCCTCTTTTCCACTGG CCG A 

CTAAAAATTTGGCAATTT AAAG/VAAG CATCTTTTAAAG ATAGAAAAGGTTATTTCAACAA 

AAAAGTATCTTTTCTTCACTTTTCTTTCAACAATTCAAAGATGGCTAGAACCATAACTTT

TGATATCCCTTCCCAATATAAACTCGTAGATTTAATAGGTGAGGGAGCGTACGGAACAGT 

ATGTTCAGCAATTCATAAGCCTTCCGGCATAAAGGTAGCTATCAAGAAAATACAACCGTT 

TAGCAAAAAATTGTTTGTTACAAGAACTATACGTGAGATCAAGCTTTTACGGTATTTCCA 

TGAACACGAAAACATAATAAGTATATTGGATAAAGTAAGGCCAGTATCCATAGACAAACT 

AAACGCTGTTTATTTAGTCGAAGAGTTGATGGAAACCGATTTACAAAAAGTAATTAATAA 

TCAGAATAGCGGGTTTTCCACTTTAAGTGATGACCATGTTCAATACTTTACATACCAAAT 

CCTCAGAGCCTTAAAGTCTATTCACAGTGCACAAGTTATCCATAGAGACATAAAGCCATC 

AAACCTCTTACTAAATTCCAATTGTGATCTCAAAGTCTGCGATTTTGGACTAGCGAGGTG 

T1TAGCTAGCAGTAGCGATTCAAGAGAAACATTGGTAGGATTCATGACGGAGTACGTCGC 

AACGCGATGGTACAGGGCACCCGAGATAATGCTAACTTTTCAAGAGTACACAACTGCGAT 

GGATATATGGTCATGCGGATGCATTTTGGCTGAAATGGTCTCCGGGAAGCCTTTGTTCCC 

AGGCAGAGACTATCATCATC7UVTTATGGCTAATTCTAGAAGTCTTGGGAACTCCATCTTT 

CGAAGACTTTAATCAGATCAAATCCAAGAGGGCTAAAGAGTATATAGCAAACTTACCTAT 

GAGGCCACCCTTGCCATGGGAGACCGTCTGGTCAAAGACCGATCTGAATCCAGATATGAT 

AGATTTACTAG ACAAAATGCTTCT^TTCAATCCTG AC AAAAGAATAAGCG CAG CAGAAGC 

TTTAAGACACCCTTACCTGGCAATGTACCATGACCCAAGTGATGAGCCGGAATATCCTCC 

ACTTAATTTGGATGATGAATTTTGGAAACTGGATAACAAGATAATGCGTCCGGAAGAGGA 

GGAAGAAGTGCCCATAGAAATGCTCAAAGACATGCTTTACGATGAACTAATGAAGACCAT 

GGAATAGTATTCACAAGAAGATTTCTGCCATACTre 

GCAGTGACACGTTGTGGTCTGTAGGTCAATATGTAAGTAAGAAACTTC 

CACGATGCATGCCAATGGAAAAATGCAAGGAACGAAATGGCGCC^ 

TTTTTTCXSCCAGCAGAAGTACACGAAATGCGGCITCATGAGCCTCT^ 

AAACGGGAAATGCAGAGAAAAACCAGCCATCGCX3TGTGCTTGGAGAGCTGATC 

AATCAAAGAGGCGATATCAACACCTTTTATCCAGCACTATTCAACAGTGAATGGGCTCCC

AAGTAAGTCTTGGCATTGTGCTTTCTATTCTTAAGT^ 

GGTTTGTTTATTCCTGGCTAGATGTTCGCATTCGTTTTCTAGTTGACCATA 

TATTCACAACTAATACCCAGCCAAGGTAGTCTAAAAGCTAATTTCTCTAAAAGGGAGAAA

GTTGGTGATTTTTTATCTCGCATTATTATATATGCAAGAATAGTTAAGGTATAGTTATAA 

AGTTTTATCTTAATTGCCACATACGTACATTGACACGTAGAAGGACTCCATTA T W 1 W 1 M 1 " 1 " 1T 

CATTCTAGCATACTATTATTCCTTGTAACGTCCCAGAG 

TTTCITAACGGTGACGAAGGATCACCATACAACAACTACTAAAG 

ACCTTGC^VACTATTTATCTGACATTTGCCTTACTTTTATCT 

ATTTTTCAATTTG ATTTCTAAAGCTTTTTG CITAGGC ATACCAAACCATCCACTrcATTT A 
ACACCTTATTTTTTTl^TCGAAGACAGC^^ 

TACAACAATTTCATTCTTCATCCTATGAAATGACX3AAAATAACCAGA 
KSS1 (YGR040W) Protein Sequence:

MARTITFDIPSQYKLVDLIGEGAYGTVCSAIHKPSGIKVAIKKIQPFSKKLFVTRTIREI
KLLRYFHEHENIISILDKVRPVSIDKLNAVYLVEELMETDLQKVINNQNSGFSTLSDDHV

FMTEYVATRWYRAPEIMLTFQEYTTAMDIWSCGCILAEMVSGKPLFPGRDYHHQLWLILE

VLGTPSFEDFNQIKSKRAKEYIANLPMRPPLPWETVWSKTDLNPDMIDLLDKMLQFNPDK
PGU1 (YJR153W) DNA Sequence (including 1000 bp upstream and 1000 bp
downstream of start and stop codons, respectively):



AATACTGAATAGAATCACGCTACTACGACAAGACTCGGTTACTGTGCCTAAAATAATCCT 

GTGATAAACGAGTTATGTTAAACGCAGTACAGGGGTTAAAGGGCATTGAGTTTTTGTGAG 

TGGAAATGCCCCCGTTATAGCTTCCAGTTTAATTACAAATTATCAATTTAAGCAAATATA 

ACTGGAGGATTGGGGAGGCGACTAAAAATGGCTACCACGCTATTAGACATACAACATTGA 

GTATTTTATGTAATTTTGTTACTGCTAGCACGGCCATGCAATTGGCAACTGAAAGCTATC 

TGACAACTTAAATGATTCTTAAAACAATGACGACTATAATCTTCTCTAAGAAGTTTCATA 

TCCATCTTCCTCATTATTCAGTTTCTTTTTCCTCTTGAAAGTATCGTAAAGAACAACGTC 

TTCACATTAGCTATTAGAAGACCATTGAACTACCGGATATGAGTAAGAGTGATCTTGCCG

GAGAGATAATAGCTGCACAAAGGCCAAGGATTAGATTAATGGGTGC^^ 

AATAGTTTACAGTCATTTATTCGCAATAAATCAATTTTTTT^ 

TGATAAAAAATTCTTCACTGAAGAGAGATGCTTACATTCTAATTCTTGAATAAAAGACTC 
TCTAACGCTGTGAATTCTCTTTAGCTGTAACGGAAACAGAGAGTTATTCCGTAGTCACTG 
AAin w r i TTTTTTTTGACGCTATTATTTAAAACCT^ 

CACGAGTTTCAATCCCAGAATGTACGAGTTATAATTCTCCTAGATGCATGATACTCGTGC 
ATTCGTTTAACAATCATACCAATTTCCCATTTTCGGGATATTAAACATGAACATACTTTT 
TTACTGTGAGAATGTGGTTTCACAATTATTCCATACAGGTATAAAAACGCACAGAACT 
AAACGGGAAGACTATCT'ACCGACATTGATGGAGAA 

ACTTTATTTCCACTTTGTGCGCTTTTGCGATCGCAACACCTTTGTCAAAA^ 

TACCCTAACAGGATCTTCTTTGTCTTCACTCTCAACCGTGAAAAAATGTAGCAGCATCGT 

TATTAAAGACTTAACTGTCCCAGCTGGACAGACTTTAGATTTAACTGGGTTAAGCAGTGG 

TACTACTGTTACGTTTGAAGGCACAACCACATTTCAGTACAAGGAATGGAGCGGCCCTTT 

AATTTCAATCTCAGGGTCTAAAATCAGCGTTGTTGGTGCTTCGGGACATACCATTGATGG 

TCAAGGAGCAAAATGGTGGGATGGCTTAGGTGATAGCGGTAAAGTCAAACCGAAGTTTGT 

AAAGTTGGCGTTGACGGGAACATCTAAGGTCACCGGATTGAATATTAAAAATGCTCCACA 

CCAAGTCTTCAGCATCAATAAATGTTCAGATTTAACCATCAGCGACATAACAATTGATAT 

CAGAGACGGTGATTCGGCTGGTGGTCATAATACGGATGGGTTTGATGTTGGTAGTTCTAG 

TAACX3TCITAATTCAAGGATGTACTGTTTATAATCAGGATGACTGTATTGCTGTGAATTC 

CGGTTCAACTATTAAATTTATGAACAACTACTGCTACAATGGCCATGGTATTTCTGTAGG

TATCAACTCTGACAACGGGTTGAGAATAAAAACCGTAGAAGGTGCGACAGGCACAGTCAC 

TAATGTCAACTTTATCAGTAATAAAATTAGCGGCATAAAAAGTTATGGTATTGTTATCGA 

AGGCGATTATTTGAATAGTAAGACTACTGGAACTGCTACAGGTGGCGTTCCCATTTCGAA 

TTTAGTAATGAAGGATATCACCGGGAGCGTGAACTCCACAGCGAAGAGGGTTAAAATTTT 

GGTGAAAAACGCTACTAACTGGCAATGGTCTGGGGTGTCAATTACCGGTGGTTCTTCCTA 

TTCTGGATGTTCTGGAATCCCATCTGGATCTGGTGCAAGCTGTTAATCCTCTTTTAAAGT 

ACrCATATGACTATACATACCTTCTTTTCTTTTCTTTACTATTC^^ 

AAGATGCAGGAAAATATTGGTATTTGTTCGGCAATTTATG 

AGGTCTAATTATTACTGTTGATTTGTATCAAGTTGGTATC^ 

GAGATACX3CTATGCTCATCCGGATAGCAACAATGAGAGCCTAAAAGTCCTAATTGAGAAG 

GAGGAAGGTCAATAATTGGTAAAAAAAATGGTAAATGCGACTAAGTACTACAATTGAAAC
GAATGAGCGCACTTCATCTTCCTACAAAACGCTGCGGCTGAAAAAGTTACATAAAAAACC 
GTCCTCAATAGCGTTAATCCAGCGTACATGAGAAAGTAATGACAAAGTCTTCGGTAATAT 
CAGTGCATCTACCAATATGACACAATTGTGAAACTTCGCTGACTCAAATAATAGCCCTGT
TTTTTTGACCATTGTTACCCATCGAGCCAGTGAGAAAAAAGCCAAAATATCTTTAAGGCC
TTCTCCATTTTATGTTTATCGATATTGTGTTGTCTGC^ 

TTACTTTGCCTCTTGTTATAAACTAAGTCTGCCGAATTATGCAATATATAGCAAAAGCTG 
AAAATAGATGTAATTACATAATTCGCAGTTGTATATGAGTATCCTTAACTCGTACATTCC 
AGTTCATCTGTGACAAGGCACTGTTTTCCCTAATAATTATTAGGGAAACGTCCTTCAAAA 
ATCAAAATAAT*TTTAGAGAGTCTCATOU^CCTTCX3CCATAGTTCGTGATGAAAACTTTAC 
GGTACGTCAGACTTTAGATATTGAU"l^^U M l*rATTATTTCTCCCATCGTGAGTACAATTAC 
CCTAGTTCGAACTATATCTTTCATTA 
PGU1 (YJR153W) Protein Sequence:

WAENNHVINSDNGLRIKTVEGATGTVTNVNFISNKISGIKSYGIVIEGDYLNSKTTGTAT 

GGVPISNLVMKDITGSVNSTAKRVKILVKNATNWQWSGVSITGGSSYSGCSGIPSGSGAS
YLR042C DNA Sequence (including 1000 bp upstream and 1000 bp downstream of
start and stop codons, respectively):

TTCTCGAGCATTAGATGATTAAATCAAAATGACATAGTATT^ 
CTTTGTTTAAGAAGTGGAATACTTTTGCITGAGTTGTT 

GTCTTAACAAATATTTTCAAGACCGGTAAGCCGAAGATGAAAAATCATTATTAACTCATT 

TTTTGAACAAAAATATAAACAAAAGAAAGGCAACGCACAATTTTAGAGATACATAAAACG

CAGTGGATGTTAAAAATAACAGCGGTACAGAAACGCCTGTCTCGCTCCAATAATAATTAT 

ACAAATTTGAAACCGAACGCAATGTGCCAAGAAATGTAAACACACTATAGAAAAAAATAG 

AACGGTGCACATTGTGCTAGCATATCTGCTTGGTTCrcAACAAG 

TCTCCTAGCCCAATTCTTGCCAAGTTTTCAACCTCAATCTT^ 

TATGAGGGGTCAAAATTTAGTGGAGGCCGCTTACAATCCTTCTATTTCCTCTGG 

TTAGCCGTCTGGCCAGACCTAAGCGTCATAATCTGGAGAATTTCATTGCATGCGAGAATA

TGATAAGTAAGAACTTGTTTATTTATACAAGTTCCACCCACTCATACACGGCTACAATTA

TCACGTATAATAACGTTTCGTCTAGCCC^CCTTTTTTACTTTTC 

GAGGATTTGGCCAAGAATGCCCCGAACAGCGGAAAAAATGGCGTCGCAGTTTCAGATGTA 
TAGACTCATCTTGTAGAAAAAAGAATGCAAGAATGAAGTCTTTTCGTGGTGTTTTGAAAA 
CACTATAAACAAACCGTCAACAAACATTTTGTATAAATATTTAGCTATATATTGAATATC 
TTGACCAGTAAAGCACCTTGAGAAATTGTAAGCTTGAAGAACGTACTTTGATATCCCTCC 
GTTTCATCATCCTATAGCTCGTCAACAAATCAAAAAAAATATGAAGATCAGTCAATTTGG 
CTCTTTAGCTTTCGCCCCT^TTGTGCTACTACAACTGTTCATTGTTCAAGCGCAACTTCT 
CACAGATTCAAATGCTCAGGATTTGAATACTGCCCTTGGACAGAAAGTGCAATACACCTT 
TCTTGACACTGGAAATTCTAACGATCAACTACTTCATCTTCCA^ 

CATTATTACTGGTTCATTAGCTGCTGCTAATTTCACCGGTTCTTCATCATCGTCX3TCTAT 

ACCAAAAGTCACTTCCAGCGTCATAACATCTATAAATTACCAATCCTCAAATTCTACGGT 

AGTCACCCAGTTCACGCCATTGCCTTCTTCGTCGAGAAATGAAACAAAAAGCTCTCAAAC 

AACTAATACTATAAGTTCAAGTACAAGCACAGGAGGTGTAGGTTCAGTCAAGCCATGTCT 

TTACTTCGTTTTAATGTTAGAAACAATCGCTTATTTGTTTTCTTAAACA 

TTCAAGGTCTTCGCAGGTGTAAGAAAACCCGTGGTCTCCATATTCTTAAGTATGATAAAT 

AAAAAAAAACTTAAT AAATT ATTAATTG CTTCAAACCTTTTTCTTTTTTT AGTTTTT A 

ATTTCAAACGTTATCTTCATTG AACX3 CCCAAATAGGGAAAAATCCTGGCAAATTTTTTAT 

TGCTGTCATCCAAGGCTATGCTAGAAAATTCAAGAGC^ 

TTTCTAATCAAGTGATTTCGATATCCAGTTACGAACCAT^ 

TGCGTATCAATGATATTTG CTC CTTC^l^rCX^CCTCATT AAAAATATTCTCCTCaGTAAGC 
TTTCTAATCAGCCACAGTTTTGCTGCCAAAACITTAAOT 

CTTTCCCAGGTCCGCAG CTGCAG ATGCAG ACATGG CATTCTTCATGG AGTTTTT AAACGAT 
TTCGACACCGCTTTTCGAGAGTATACCTCATAC^TGATC 
CAACCTGTTGCTGACTACTACTATCACATGGTTGATTT^ 
TCTGATATTGCTCAGAGTTTTCCGTTCACTCAATTCCAAACATTC^ 

TGGTATACCTTCTTTGCTAAACAAAGCCTCCGCCACCACC^TATACCTTCCCCAACACTTC 
ATAA CAGGTGAGACJVGAAGCTACC^TGACTAACrCATCTTATGCCAGCCAAAAAAACTCC 
GTTTCCAATTCTGTTCCTTTCTCGAC^^ 

AATGAAGAAAACAGTACAACAGC^CrTTATATCCGCATCAAACTCTTCTTCAACATCCAGA 
ACTAGTCAATCACAGAATGGTGCCCA 
YLR042C Protein Sequence:

MKISQFGSLAFAPIVLLQLFIVQAQLLTDSNAQDLNTALGQKVQYTFLDTGNSNDQLLHL
PSTTSSSIITGSLAAANFTGSSSSSSIPKVTSSVITSINYQSSNSTVVTQFTPLPSSSRN
ETKSSQTTNTISSSTSTGGVGSVKPCLYFVLMLETIAYLFS
SVS1 (YPL163C) DNA Sequence (including 1000 bp upstream and 1000 bp
downstream of start and stop codons, respectively):

ATTCTTTGGTTGTGCTTATACATAATTGAAAAAGTGCTAAAATCCTTACACTTCCAAAAC 

ATTGAAAGTGGTAATTATTTTCCATCTAAAACCGTTGGGAGCCACCCCAGAAAACCCTTA

TTCTCTGCCTTCGTGAAACAGCTGCTTATATTCATTGTTGGGCTGGGCGTGATGAAGTTC

TGCGTGTTTCTAATACTAAACTACTTAGAAGACTTGGCATACTGGTTCGCCGATCTTATC 

CTTGGCTGGTCAGATTCATGGCCAAACTTTCAAGTTTTTCTGGTCATGTTTGTCTTTCCT 

ATCTTACTGAATTGCTTCCAGTACTTTTGTGTCGACAATGTCATCAGGTTACATTCTGAG 

AGCCTAACGATAACCAATGCAGAGAATTTTGAAACGAACACATTCCTAAATGACGAAATT 

CCTGATTTATCGGAAGTCTCAAATGAAGTGCCTT^CAAGGATAACAACATTTCCAGCTAT 

GGTAGCATAATATAGTATTCCAAGGATAAGGAAAGCATGCACTGTTTATTTCCTTTCCTT 

GCTTAATTGATTTTTTTTAAAGGGAACAAACATTTTGATTTCAATTTCCACAAGCCTAGA 

CTCTTCAACACATAATCTGTGGGTTATTGTTTGGGAAAGCATTCTCCGCTAGAAGAATG^ 

AACTGGCGCTCAGGTTTGATTCTATAACTACGGCAGTTTTTCCTATTCTATTTTCGTTTT 

TTGATTTTCCCGCCGCATTGGATATTCAATTCGCGACGCTAATAATTGGCATTTCGTGTT 

CTTAAGTAATTTCGTGTTTCAAATAACCGTAAACAGAGAAAGACCCAAGAATTTCAGATG 

GCTTAGAAGAGGTAGACATTAAATCAATCTGTATGTGATGGAGAGGGAGTGTATTTAAAA 

GACGTAAGAAAATGAATTATCAAGATCCGTTATGGCCATCTAGTCTCTTTCTTGTACACT 

AGTTGTCTAACACAACCAACAAATTAGAATATATATCGCAATGATTTTCAAAATATTGTG 

TAGTTTACTACTGGTAACCTCCAACTTCGCTTCTGCCTTATATGTCAATGAAACTACGAG 

CTATACACCATACACGAAGACATTAACTCCAACATACTCTGTTTCACCTCAAGAGACAAC 

ATTAACGTACAGCGATGAAACAACCACCTTCTACATAACATCTACTTTTTACTCTACCTA 

CTGGTTCACTACCTCCCAATCAGCTGCTATTATTAGTACACCTACTGCAAGTACACCTAC 

TGCAAGCACGCCTAGCCTAACTACGTCCACAAATGAATACACCACCACCTATTCTGACAC 

AGACACCACCTACACGTCTACTCTGACCTCTACTTACATAATAACTCTATCTACGGAATC 

CGCCAACGAGAAGGCTGAACAGATTTCCACGAGCGTCACAGAAATTGCTTCTACAGTAAC 

CGAATCGGGCAGTACATACACCTCTACTTTGACCTCAACCTTATTGGTTACTGTATATAA 

TTCCCAAGCTAGTAATACAATAGCGACATCCACAGCTGGGGACGCCGCCTCCAATGTTGA

TGCCTTAGAAAAGTTAGTCTCTGCTGAACATCAATCTCAGATGATTCAAACCACATCCGC 

CGATGAACAGTACTGTAGTGCGTCTACCAAGTATGTTACAGTTACAGCTGCTGCAGTTAC 

CGAAGTGGTTACTACTACGGCGGAGCCTGTTGTTAAATACGTTACTATAACTGCCGATGC 

TAGTAATGTTACAGGTTCTGCTAACAACGGTACCCACATTTAATGCGTGACGTTGAATCG 

AGAAAAAAAGCTACTTTTAACGAAACCTTTACTAGTTATCCTATATGGGATCACTAGTAT 

TTTTTGATTTACGATTCAATAAATAGACTAGAGACAACTTTGATATCATTCCTTAAAAAA 

TACATAAAGCGCAAATTCAACCCCATTGATACATATATAAGTAGTTCTATTATGACTTTC 

AAGAACAATAGTAGCTTTTCTAAATAATCAATAAGTAGCACAAAATCTOTCTGTTTGTAC 

GCTTATATTTAGTTTGCGTTTATTTGCX3AGCGCCACGAGAAGGGGCAGGAA 

AATAGTTTGCAATAAACATCGAATGATGATTTCAACCACCGATACATAAACCAGCGAGGC

TTTCAAGGAAGAATGAACGTGAACrCGTGAACTCAAAAAGAAAAT 

GAAATTAGATTCTGATGTTTCTGAAAGACTTAAATCTCAGGCATGCACGGTATCGCTAGC
ATCAGCGGTTAGAGAAATAGTTCAAAATTCTGTAGATGCACACGCTACCACTATCGACGT 
CATGATCGACCTCCCTAATTTGAGCTTTKSGAGTTTACGAT^ 

AAGTGACCTAAATATATTGGCCACACAAAATTATACTTCCAAAATACGAAAGATGAATGA
TTTAGTAACGATGAAAACCTACGGTTACAGAGGAGACGCCCTATATAGCATTTCTAATGT 
CTCTAATCTGTTTGTTTGTTCCAAGAAA7VAGGATTACAACTCTGCATGGATGAGAA7UVTT 
TCCATCCAAAAGCGTCATGTTGAGTGAGAATACCATACTCCCAATAGATCCTTTTTGGAA 
AATTTGTCCTTGGAGCCGAACAAAGTCTGGTACTGTTGTTATTGTTGAAGATATGCTGTA 
TAATTTACCTGTCCGGCGCAGAATACTAAAGGAAGAACCCCCTTTCAAGACTTTTAACAC 
AATAAAGGCAGATATGCTACAGA 



FIG. 35 
SVS1 (YPL163C) Protein Sequence:

MIFKILCSLLLVTSNFASALYVNETTSYTPYTKTLTPTYSVSPQETTLTYSDETTTFYIT
STFYSTYWFTTSQSAAIISTPTASTPTASTPSLTTSTNEYTTTYSDTDTTYTSTLTSTYI
ITLSTESANEKAEQISTSVTEIASTVTESGSTYTSTLTSTLLVTVYNSQASNTIATSTAG
DAASNVDALEKLVSAEHQSQMIQTTSADEQYCSASTKYVTVTAAAVTEVVTTTAEPVVKY
VTITADASNVTGSANNGTHI
<110> Rosetta Inpharmatics, Inc.

<120> METHODS FOR IDENTIFYING PATHWAY-SPECIFIC REPORTERS AND
TARGET GENES, AND USES THEREOF



<130> 9301-040-228 
<140> 

<141> 2000-03-29 

<150> 09/282,243 
<151> 1999-03-31 

<160> 30 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 2385 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (351)..(2282)
<400> 1 

tctagttttc taatcatata tctttttata atataatacc aatagaataa aaatgtataa 60 

actgacattg cattcggtct ttacgactct cgctttatcc attcagcctt tttttttttt 120 

tttttttttt ctctatctgc taaacgagta gtagtataat caaaaatgtg ttatttagta 180 

tatcggttgt aaaggagaaa gtatggtctc tctattttta ttttattaac gaaaaatact 240 

aaacgccgat ggggattact atataattat aatagtattt gcagaatagt agaattcttt 300 

tcacagttca cgttcagttt ctcctctgtt ttatcgaacg tttattcatc atg tcc 356

Met Ser
1

aag gtc tat ctg aat tca gac atg att aac cat ttg aac tcc aca gtt 404
Lys Val Tyr Leu Asn Ser Asp Met Ile Asn His Leu Asn Ser Thr Val
5 10 15

caa gct tac ttt aac tta tgg ttg gag aag caa aac gca ata atg cgt 452
Gln Ala Tyr Phe Asn Leu Trp Leu Glu Lys Gln Asn Ala Ile Met Arg
20 25 30
tct caa ccc caa att att caa gat aac caa aaa ctg ata ggc att aca 500
Ser Gln Pro Gln Ile Ile Gln Asp Asn Gln Lys Leu Ile Gly Ile Thr
35 40 45 50

acg cta gtt gcc tca att ttc act ctg tat gtt ttg gtc aag ata atc 548
Thr Leu Val Ala Ser Ile Phe Thr Leu Tyr Val Leu Val Lys Ile Ile
55 60 65

tcc acc cca gca aag tgt tcc tcg tcc tat aag cca gtc aaa ttc tcc 596
Ser Thr Pro Ala Lys Cys Ser Ser Ser Tyr Lys Pro Val Lys Phe Ser
70 75 80

ctt cct gca cca gag gcc gct caa aat aat tgg aag ggc aag agg tct 644
Leu Pro Ala Pro Glu Ala Ala Gln Asn Asn Trp Lys Gly Lys Arg Ser
85 90 95

gtt tcc act aac ata tgg aat cct gaa gaa cca aac ttt att caa tgt 692
Val Ser Thr Asn Ile Trp Asn Pro Glu Glu Pro Asn Phe Ile Gln Cys
100 105 110

cat tgt ccc gcc aca ggt caa tat cta ggt tct ttt cca tcg aaa acg 740
His Cys Pro Ala Thr Gly Gln Tyr Leu Gly Ser Phe Pro Ser Lys Thr
115 120 125 130

gaa gct gac ata gat gaa atg gtt tct aag gca ggc aaa gct caa tct 788
Glu Ala Asp Ile Asp Glu Met Val Ser Lys Ala Gly Lys Ala Gln Ser
135 140 145

act tgg ggc aat tct gat ttc tca aga aga ttg aga gtt ttg gct tct 836
Thr Trp Gly Asn Ser Asp Phe Ser Arg Arg Leu Arg Val Leu Ala Ser
150 155 160

ttg cat gat tat att cta aat aat caa gat ctt att gcg aga gta gcg 884
Leu His Asp Tyr Ile Leu Asn Asn Gln Asp Leu Ile Ala Arg Val Ala
165 170 175

tgc agg gat tca gga aag aca atg tta gac gca tcg atg ggt gaa atc 932
Cys Arg Asp Ser Gly Lys Thr Met Leu Asp Ala Ser Met Gly Glu Ile
180 185 190

ttg gtt act tta gaa aaa att caa tgg act ata aag cac ggc caa aga 980
Leu Val Thr Leu Glu Lys Ile Gln Trp Thr Ile Lys His Gly Gln Arg
195 200 205 210

gcg ttg caa cct tcg aga cgt ccg ggc ccc act aat ttt ttc atg aag 1028
Ala Leu Gln Pro Ser Arg Arg Pro Gly Pro Thr Asn Phe Phe Met Lys
215 220 225
tgg tat aaa ggt gca gaa atc cgt tat gaa cca ctg ggt gtg atc agt 1076
Trp Tyr Lys Gly Ala Glu Ile Arg Tyr Glu Pro Leu Gly Val Ile Ser
230 235 240

tct atc gtt tcc tgg aac tat cca ttc cat aac tta ttg ggt cca att 1124
Ser Ile Val Ser Trp Asn Tyr Pro Phe His Asn Leu Leu Gly Pro Ile
245 250 255

att gca gca ttg ttc aca ggg aat gcc att gta gta aaa tgt tca gaa 1172
Ile Ala Ala Leu Phe Thr Gly Asn Ala Ile Val Val Lys Cys Ser Glu
260 265 270

caa gtt gtc tgg tct tcg gaa ttt ttc gtc gag ctg atc cgc aaa tgt 1220
Gln Val Val Trp Ser Ser Glu Phe Phe Val Glu Leu Ile Arg Lys Cys
275 280 285 290

ttg gaa gct tgt gat gaa gat cca gat ttg gtt cag ttg tgc tat tgt 1268
Leu Glu Ala Cys Asp Glu Asp Pro Asp Leu Val Gln Leu Cys Tyr Cys
295 300 305

tta cct cca act gaa aat gat gat tcc gca aat tat ttc acc tct cat 1316
Leu Pro Pro Thr Glu Asn Asp Asp Ser Ala Asn Tyr Phe Thr Ser His
310 315 320

cct ggt ttc aaa cat atc act ttt att ggc agt cag ccc gta gcg cac 1364
Pro Gly Phe Lys His Ile Thr Phe Ile Gly Ser Gln Pro Val Ala His
325 330 335

tat att cta aaa tgc gct gcc aaa tca ttg aca ccc gta gtt gtg gag 1412
Tyr Ile Leu Lys Cys Ala Ala Lys Ser Leu Thr Pro Val Val Val Glu
340 345 350

ctt ggt ggt aag gat gcg ttt att gtc cta gac tca gct aag aat tta 1460
Leu Gly Gly Lys Asp Ala Phe Ile Val Leu Asp Ser Ala Lys Asn Leu
355 360 365 370

gat gct tta tct tct atc atc atg agg ggt act ttc caa tca tcc ggt 1508
Asp Ala Leu Ser Ser Ile Ile Met Arg Gly Thr Phe Gln Ser Ser Gly
375 380 385

caa aat tgt att ggt att gag agg gtt att gtc agt aag gaa aat tat 1556
Gln Asn Cys Ile Gly Ile Glu Arg Val Ile Val Ser Lys Glu Asn Tyr
390 395 400

gat gat tta gtc aag att ttg aat gac cgt atg act gca aat cca cta 1604
Asp Asp Leu Val Lys Ile Leu Asn Asp Arg Met Thr Ala Asn Pro Leu
405 410 415
cgc caa ggg tct gat att gat cat tta gaa aat gtt gat atg ggg gca 1652
Arg Gln Gly Ser Asp Ile Asp His Leu Glu Asn Val Asp Met Gly Ala
420 425 430

atg ata tct gac aac aga ttc gat gaa cta gaa gct ttg gtt aaa gat 1700
Met Ile Ser Asp Asn Arg Phe Asp Glu Leu Glu Ala Leu Val Lys Asp
435 440 445 450

gct gtt gca aag gga gct cgt tta ctt caa ggt ggt tcc cgc ttc aaa 1748
Ala Val Ala Lys Gly Ala Arg Leu Leu Gln Gly Gly Ser Arg Phe Lys
455 460 465

cat cca aag tat cca caa ggt cat tat ttc caa cca act ctt ttg gtg 1796
His Pro Lys Tyr Pro Gln Gly His Tyr Phe Gln Pro Thr Leu Leu Val
470 475 480

gat gtc act cca gaa atg aaa ata gca caa aac gaa gtg ttt ggc cca 1844
Asp Val Thr Pro Glu Met Lys Ile Ala Gln Asn Glu Val Phe Gly Pro
485 490 495

att tta gtc atg atg aaa gct aag aat act gac cat tgt gta caa cta 1892
Ile Leu Val Met Met Lys Ala Lys Asn Thr Asp His Cys Val Gln Leu
500 505 510

gcc aac tct gcg cca ttt ggt cta ggt ggt tct gtg ttt ggt gcg gat 1940
Ala Asn Ser Ala Pro Phe Gly Leu Gly Gly Ser Val Phe Gly Ala Asp
515 520 525 530

atc aag gaa tgc aat tac gtc gca aat agc cta caa act ggt aat gta 1988
Ile Lys Glu Cys Asn Tyr Val Ala Asn Ser Leu Gln Thr Gly Asn Val
535 540 545

gcc att aat gat ttt gct aca ttc tat gtt tgt caa tta cca ttt ggt 2036
Ala Ile Asn Asp Phe Ala Thr Phe Tyr Val Cys Gln Leu Pro Phe Gly
550 555 560

ggt atc aat ggt tca ggt tac ggt aaa ttt ggt ggt gaa gaa ggt ctt 2084
Gly Ile Asn Gly Ser Gly Tyr Gly Lys Phe Gly Gly Glu Glu Gly Leu
565 570 575

ttg ggt ttg tgc aat gcc aaa agt gtc tgt ttt gat act ttg cct ttt 2132
Leu Gly Leu Cys Asn Ala Lys Ser Val Cys Phe Asp Thr Leu Pro Phe
580 585 590

gtc tcc act caa att cca aaa cca tta gac tac cct att cgt aac aat 2180
Val Ser Thr Gln Ile Pro Lys Pro Leu Asp Tyr Pro Ile Arg Asn Asn
595 600 605 610
gct aag gct tgg aat ttt gta aag agt ttc atc gta gga gct tat aca 2228
Ala Lys Ala Trp Asn Phe Val Lys Ser Phe Ile Val Gly Ala Tyr Thr
615 620 625

aat tcc aca tgg caa aga ata aag tca ctg ttc tct tta gct aaa gaa 2276
Asn Ser Thr Trp Gln Arg Ile Lys Ser Leu Phe Ser Leu Ala Lys Glu
630 635 640

gcc agc tagtttactt tagaggaagc aacaaactta tcaataattt ggtatttatt 2332
Ala Ser

attatataaa atgaactttt tatgtacaag atttatgatt ttttgattct ata 2385



<210> 2 
<211> 644 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 2 

Met Ser Lys Val Tyr Leu Asn Ser Asp Met Ile Asn His Leu Asn Ser
1 5 10 15

Thr Val Gln Ala Tyr Phe Asn Leu Trp Leu Glu Lys Gln Asn Ala Ile
20 25 30

Met Arg Ser Gln Pro Gln Ile Ile Gln Asp Asn Gln Lys Leu Ile Gly
35 40 45

Ile Thr Thr Leu Val Ala Ser Ile Phe Thr Leu Tyr Val Leu Val Lys
50 55 60

Ile Ile Ser Thr Pro Ala Lys Cys Ser Ser Ser Tyr Lys Pro Val Lys
65 70 75 80

Phe Ser Leu Pro Ala Pro Glu Ala Ala Gln Asn Asn Trp Lys Gly Lys
85 90 95

Arg Ser Val Ser Thr Asn Ile Trp Asn Pro Glu Glu Pro Asn Phe Ile
100 105 110

Gln Cys His Cys Pro Ala Thr Gly Gln Tyr Leu Gly Ser Phe Pro Ser
115 120 125

Lys Thr Glu Ala Asp Ile Asp Glu Met Val Ser Lys Ala Gly Lys Ala
130 135 140

Gln Ser Thr Trp Gly Asn Ser Asp Phe Ser Arg Arg Leu Arg Val Leu
145 150 155 160



Ala Ser Leu His Asp Tyr lie Leu Asn Asn Gin Asp Leu lie Ala Arg 
165 170 175 

Val Ala Cys Arg Asp Ser Gly Lys Thr Met Leu Asp Ala Ser Met Gly 
180 185 190 

Glu lie Leu Val Thr Leu Glu Lys lie Gin Trp Thr lie Lys His Gly 
195 200 205 

Gin Arg Ala Leu Gin Pro Ser Arg Arg Pro Gly Pro Thr Asn Phe Phe 
210 215 220 

Met Lys Trp Tyr Lys Gly Ala Glu He Arg Tyr Glu Pro Leu Gly Val 
225 230 235 240 

He Ser Ser He Val Ser Trp Asn Tyr Pro Phe His Asn Leu Leu Gly 
245 250 255 

Pro He He Ala Ala Leu Phe Thr Gly Asn Ala He Val Val Lys Cys 
260 265 270 

Ser Glu Gin Val Val Trp Ser Ser Glu Phe Phe Val Glu Leu He Arg 
275 280 285 

Lys Cys Leu Glu Ala Cys Asp Glu Asp Pro Asp Leu Val Gin Leu Cys 
290 295 300 

Tyr Cys Leu Pro Pro Thr Glu Asn Asp Asp Ser Ala Asn Tyr Phe Thr 
305 310 315 320 

Ser His Pro Gly Phe Lys His He Thr Phe He Gly Ser Gin Pro Val 
325 330 335 

Ala His Tyr He Leu Lys Cys Ala Ala Lys Ser Leu Thr Pro Val Val 
340 345 350 

Val Glu Leu Gly Gly Lys Asp Ala Phe He Val Leu Asp Ser Ala Lys 
355 360 365 

Asn Leu Asp Ala Leu Ser Ser He He Met Arg Gly Thr Phe Gin Ser 
370 375 380 

Ser Gly Gin Asn Cys He Gly He Glu Arg Val He Val Ser Lys Glu 
385 390 395 400 

Asn Tyr Asp Asp Leu Val Lys Ile Leu Asn Asp Arg Met Thr Ala Asn
405 410 415

Pro Leu Arg Gln Gly Ser Asp Ile Asp His Leu Glu Asn Val Asp Met
420 425 430

Gly Ala Met Ile Ser Asp Asn Arg Phe Asp Glu Leu Glu Ala Leu Val
435 440 445

Lys Asp Ala Val Ala Lys Gly Ala Arg Leu Leu Gln Gly Gly Ser Arg
450 455 460

Phe Lys His Pro Lys Tyr Pro Gln Gly His Tyr Phe Gln Pro Thr Leu
465 470 475 480

Leu Val Asp Val Thr Pro Glu Met Lys Ile Ala Gln Asn Glu Val Phe
485 490 495

Gly Pro Ile Leu Val Met Met Lys Ala Lys Asn Thr Asp His Cys Val
500 505 510

Gln Leu Ala Asn Ser Ala Pro Phe Gly Leu Gly Gly Ser Val Phe Gly
515 520 525

Ala Asp Ile Lys Glu Cys Asn Tyr Val Ala Asn Ser Leu Gln Thr Gly
530 535 540

Asn Val Ala Ile Asn Asp Phe Ala Thr Phe Tyr Val Cys Gln Leu Pro
545 550 555 560

Phe Gly Gly Ile Asn Gly Ser Gly Tyr Gly Lys Phe Gly Gly Glu Glu
565 570 575

Gly Leu Leu Gly Leu Cys Asn Ala Lys Ser Val Cys Phe Asp Thr Leu
580 585 590

Pro Phe Val Ser Thr Gln Ile Pro Lys Pro Leu Asp Tyr Pro Ile Arg
595 600 605

Asn Asn Ala Lys Ala Trp Asn Phe Val Lys Ser Phe Ile Val Gly Ala
610 615 620

Tyr Thr Asn Ser Thr Trp Gln Arg Ile Lys Ser Leu Phe Ser Leu Ala
625 630 635 640

Lys Glu Ala Ser



<210> 3
<211> 1944
<212> DNA

<213> Saccharomyces cerevisiae

<220>
<221> CDS

<222> (801)..(1841)
<400> 3 

acgtacaaaa aagagcacgc tgctttattt atacttttgt gccacaagaa tgatcaacat 60 

caacataaat atcaactagt atctgcaaca catctgctcc acggaactaa acccgttgag 120 

cagtgccccg tggaaacgta aactatcgca aattgggatt aacaagccaa aaacagccaa 180 

gcaagattca cgaaaccgcg cctcgtttgg accccgaagg cccatttaac ggccggccgt 240 

tacaagcaag atcggcagag caaaccactc cccagcacca cagcacatca ctgcacgagc 300 

aacaataact agaacatggc agatagcgag gatacctctg tgatcctgca gggcatcgac 360 

acaatcaaca gcgtggaggg cctggaagaa gatggttacc tcagcgacga ggacacgtca 420 

ctcagcaacg agctcgcaga tgcacagcgt caatgggaag agtcgctgca acagttgaac 480 

aagctgctca actgggtcct gctgcccctg ctgggcaagt atataggtag gagaatggcc 540 

aagactctat ggagtaggtt cattgaacac tttgtataag tgtttgttgt ttatgtatcc 600 

gcatatagca gttataacag ataaatggca cttttcgcac acccgttgtt ttatctccga 660 

tagtacgtgg gcctttattt atggtcgttt aacgaaagaa cggcatcttg aattgagcag 720 

gtatttaaaa gataggacga gaaacaagca catgatctgt gtcgaaaaaa agtagcaaag 780 

agaaaaagta ggaggatagg atg aac agg aaa gta gct atc gta acg ggt act 833
Met Asn Arg Lys Val Ala Ile Val Thr Gly Thr
1 5 10

aat agt aat ctt ggt ctg aac att gtg ttc cgt ctg att gaa act gag 881
Asn Ser Asn Leu Gly Leu Asn Ile Val Phe Arg Leu Ile Glu Thr Glu
15 20 25

gac acc aat gtc aga ttg acc att gtg gtg act tct aga acg ctt cct 929
Asp Thr Asn Val Arg Leu Thr Ile Val Val Thr Ser Arg Thr Leu Pro
30 35 40

cga gtg cag gag gtg att aac cag att aaa gat ttt tac aac aaa tca 977
Arg Val Gln Glu Val Ile Asn Gln Ile Lys Asp Phe Tyr Asn Lys Ser
45 50 55

ggc cgt gta gag gat ttg gaa ata gac ttt gat tat ctg ttg gtg gac 1025
Gly Arg Val Glu Asp Leu Glu Ile Asp Phe Asp Tyr Leu Leu Val Asp
60 65 70 75

ttc acc aac atg gtg agt gtc ttg aac gca tat tac gac atc aac aaa 1073
Phe Thr Asn Met Val Ser Val Leu Asn Ala Tyr Tyr Asp Ile Asn Lys
80 85 90

aag tac agg gcg ata aac tac ctt ttc gtg aat gct gcg caa ggt atc 1121
Lys Tyr Arg Ala Ile Asn Tyr Leu Phe Val Asn Ala Ala Gln Gly Ile
95 100 105

ttt gac ggt ata gat tgg atc gga gcg gtc aag gag gtt ttc acc aat 1169
Phe Asp Gly Ile Asp Trp Ile Gly Ala Val Lys Glu Val Phe Thr Asn
110 115 120

cca ttg gag gca gtg aca aat ccg aca tac aag ata caa ctg gtg ggc 1217
Pro Leu Glu Ala Val Thr Asn Pro Thr Tyr Lys Ile Gln Leu Val Gly
125 130 135

gtc aag tct aaa gat gac atg ggg ctt att ttc cag gcc aat gtg ttt 1265
Val Lys Ser Lys Asp Asp Met Gly Leu Ile Phe Gln Ala Asn Val Phe
140 145 150 155

ggt ccg tac tac ttt atc agt aaa att ctg cct caa ttg acc agg gga 1313
Gly Pro Tyr Tyr Phe Ile Ser Lys Ile Leu Pro Gln Leu Thr Arg Gly
160 165 170

aag gct tat att gtt tgg att tcg agt att atg tcc gat cct aag tat 1361
Lys Ala Tyr Ile Val Trp Ile Ser Ser Ile Met Ser Asp Pro Lys Tyr
175 180 185

ctt tcg ttg aac gat att gaa cta cta aag aca aat gcc tct tat gag 1409
Leu Ser Leu Asn Asp Ile Glu Leu Leu Lys Thr Asn Ala Ser Tyr Glu
190 195 200

ggc tcc aag cgt tta gtt gat tta ctg cat ttg gcc acc tac aaa gac 1457
Gly Ser Lys Arg Leu Val Asp Leu Leu His Leu Ala Thr Tyr Lys Asp
205 210 215

ttg aaa aag ctg ggc ata aat cag tat gta gtt caa ccg ggc ata ttt 1505
Leu Lys Lys Leu Gly Ile Asn Gln Tyr Val Val Gln Pro Gly Ile Phe
220 225 230 235

aca agc cat tcc ttc tcc gaa tat ttg aat ttt ttc acc tat ttc ggc 1553
Thr Ser His Ser Phe Ser Glu Tyr Leu Asn Phe Phe Thr Tyr Phe Gly
240 245 250

atg cta tgc ttg ttc tat ttg gcc agg ctg ttg ggg tct cca tgg cac 1601
Met Leu Cys Leu Phe Tyr Leu Ala Arg Leu Leu Gly Ser Pro Trp His
255 260 265

aat att gat ggt tat aaa gct gcc aat gcc cca gta tac gta act aga 1649
Asn Ile Asp Gly Tyr Lys Ala Ala Asn Ala Pro Val Tyr Val Thr Arg
270 275 280

ttg gcc aat cca aac ttt gag aaa caa gac gta aaa tac ggt tct gct 1697
Leu Ala Asn Pro Asn Phe Glu Lys Gln Asp Val Lys Tyr Gly Ser Ala
285 290 295

acc tct agg gat ggt atg cca tat atc aag acg cag gaa ata gac cct 1745
Thr Ser Arg Asp Gly Met Pro Tyr Ile Lys Thr Gln Glu Ile Asp Pro
300 305 310 315

act gga atg tct gat gtc ttc gct tat ata cag aag aag aaa ctg gaa 1793
Thr Gly Met Ser Asp Val Phe Ala Tyr Ile Gln Lys Lys Lys Leu Glu
320 325 330

tgg gac gag aaa ctg aaa gat caa att gtt gaa act aga acc ccc att 1841
Trp Asp Glu Lys Leu Lys Asp Gln Ile Val Glu Thr Arg Thr Pro Ile
335 340 345

taatatatct ctgcgtacat atgtatatat atatatgtgt gtatatacat gtatgtctgt 1901

atagaaaacg catatcaact gatatatata cacgtgaagc aaa 1944



<210> 4 
<211> 347 
<212> PRT 

<213> Saccharomyces cerevisiae 



<400> 4 

Met Asn Arg Lys Val Ala Ile Val Thr Gly Thr Asn Ser Asn Leu Gly
1 5 10 15

Leu Asn Ile Val Phe Arg Leu Ile Glu Thr Glu Asp Thr Asn Val Arg
20 25 30

Leu Thr Ile Val Val Thr Ser Arg Thr Leu Pro Arg Val Gln Glu Val
35 40 45

Ile Asn Gln Ile Lys Asp Phe Tyr Asn Lys Ser Gly Arg Val Glu Asp
50 55 60

Leu Glu Ile Asp Phe Asp Tyr Leu Leu Val Asp Phe Thr Asn Met Val
65 70 75 80

Ser Val Leu Asn Ala Tyr Tyr Asp Ile Asn Lys Lys Tyr Arg Ala Ile
85 90 95

Asn Tyr Leu Phe Val Asn Ala Ala Gln Gly Ile Phe Asp Gly Ile Asp
100 105 110

Trp Ile Gly Ala Val Lys Glu Val Phe Thr Asn Pro Leu Glu Ala Val
115 120 125

Thr Asn Pro Thr Tyr Lys Ile Gln Leu Val Gly Val Lys Ser Lys Asp
130 135 140

Asp Met Gly Leu Ile Phe Gln Ala Asn Val Phe Gly Pro Tyr Tyr Phe
145 150 155 160

Ile Ser Lys Ile Leu Pro Gln Leu Thr Arg Gly Lys Ala Tyr Ile Val
165 170 175

Trp Ile Ser Ser Ile Met Ser Asp Pro Lys Tyr Leu Ser Leu Asn Asp
180 185 190

Ile Glu Leu Leu Lys Thr Asn Ala Ser Tyr Glu Gly Ser Lys Arg Leu
195 200 205

Val Asp Leu Leu His Leu Ala Thr Tyr Lys Asp Leu Lys Lys Leu Gly
210 215 220

Ile Asn Gln Tyr Val Val Gln Pro Gly Ile Phe Thr Ser His Ser Phe
225 230 235 240

Ser Glu Tyr Leu Asn Phe Phe Thr Tyr Phe Gly Met Leu Cys Leu Phe
245 250 255

Tyr Leu Ala Arg Leu Leu Gly Ser Pro Trp His Asn Ile Asp Gly Tyr
260 265 270

Lys Ala Ala Asn Ala Pro Val Tyr Val Thr Arg Leu Ala Asn Pro Asn
275 280 285

Phe Glu Lys Gln Asp Val Lys Tyr Gly Ser Ala Thr Ser Arg Asp Gly
290 295 300

Met Pro Tyr Ile Lys Thr Gln Glu Ile Asp Pro Thr Gly Met Ser Asp
305 310 315 320

Val Phe Ala Tyr Ile Gln Lys Lys Lys Leu Glu Trp Asp Glu Lys Leu
325 330 335

Lys Asp Gln Ile Val Glu Thr Arg Thr Pro Ile
340 345



<210> 5
<211> 2754
<212> DNA

<213> Saccharomyces cerevisiae

<220>
<221> CDS

<222> (1001)..(2551)
<400> 5 

gatggcaaac ctccgcaatg attggcgttc tagcggctat ccgaattcac aatcgacaag 60 
aagtacttct aacttacaca aggcaacgaa ataatatcac tctatgaaac tgccatttgg 120 
gtaataggag tatattgaac gacaccgggt caacaagcaa ctttcctaag ccttttacac 180 
ttcttcacat cattcaagat cgccttttaa cgagctacaa accttcacgt tcgttcttct 240 
atggaaacgt ttaagataac gttaaaacgt tctcaatcac agaatttaag atgattagaa 300 
atgttttcca agggataggg cgaagcacaa cctcgaaaaa tggcaaaatt ttagaatctt 360 
agccacctta acgtctactt agagccttag aaaagccatc aagattggtg gaatagttgt 420 
tgagggaact tagccgccac attctcgtag ccaaataaag cgaatctgac cattgtatgt 480 
ttctttttca ctggtatgat agcccaatgt gtttaaggaa agttaggaca acacacccga 540
agaaggacgt cacccctgca ttcccaaacg agctatgaaa tagctctttc ctctacaagt 600 
aataacaaca acttttttgt ctgttttccg accgtttaac ttcagagatt aattttttca 660 
acgcgctttc gttgaacgtc gcaaattcgt ttagaataaa cgaaaggtga cagaaataga 720 
agattatagc catgcatacg cacataaatt gaaaactgtt tcgaggctga gtattccctg 780 
cgtctgcagc catcaggggt atgactctgc tacacgttta ctatattctt ggctaaacga 840
ttcattaacg aagcgatgag tagatcacac tcggcatacg agcacaaatt tgtatggggg 900

gacggtcata tataaaaggg tgtatacgtt atccttgtta tacctgtcca aagaagtgca 960 

tttgtaactc acaacacaga cacatcctca ctttatcata atg act acg ttt agg 1015 

Met Thr Thr Phe Arg
1 5

cca cta tca agt ttt gaa aaa aaa att ctc act caa tct ttg aat gac 1063
Pro Leu Ser Ser Phe Glu Lys Lys Ile Leu Thr Gln Ser Leu Asn Asp
10 15 20

caa aga aat gga act att ttt tcg agt aca tat tca aaa tct tta agt 1111
Gln Arg Asn Gly Thr Ile Phe Ser Ser Thr Tyr Ser Lys Ser Leu Ser
25 30 35

aga gaa aat gac gct gac tgg cat tct gat gaa gtc acg ctc gga aca 1159
Arg Glu Asn Asp Ala Asp Trp His Ser Asp Glu Val Thr Leu Gly Thr
40 45 50

aat tct tcc aaa gat gat tct cgt ctg act ctg ccc cta ata gca aca 1207
Asn Ser Ser Lys Asp Asp Ser Arg Leu Thr Leu Pro Leu Ile Ala Thr
55 60 65

act ttg aag aga ttg att aaa tcg caa ccg gca ttg ttt gca act gta 1255
Thr Leu Lys Arg Leu Ile Lys Ser Gln Pro Ala Leu Phe Ala Thr Val
70 75 80 85

aac gaa gaa tgg gaa ttc gag cca ttg aag cag ctg aaa act tcc gat 1303
Asn Glu Glu Trp Glu Phe Glu Pro Leu Lys Gln Leu Lys Thr Ser Asp
90 95 100

att gtt aat gtg att gag ttt gaa acc ata aaa gat aag gag gtc aat 1351
Ile Val Asn Val Ile Glu Phe Glu Thr Ile Lys Asp Lys Glu Val Asn
105 110 115

tgc cat tgg ggt gtt cca cct cct tat ctc ttg cgt cat gcc ttc aac 1399
Cys His Trp Gly Val Pro Pro Pro Tyr Leu Leu Arg His Ala Phe Asn
120 125 130

aag act aga ttt gtt ccc gga tca aat aaa cct tta tgg aca cta tat 1447
Lys Thr Arg Phe Val Pro Gly Ser Asn Lys Pro Leu Trp Thr Leu Tyr
135 140 145

gta att gac gaa gcg cta ttg gtt ttt cat ggt cac gac gta ttg ttt 1495
Val Ile Asp Glu Ala Leu Leu Val Phe His Gly His Asp Val Leu Phe
150 155 160 165

gat ata ttt tca gca gct aac ttt cac aaa tta ttt tta aaa gag tta 1543
Asp Ile Phe Ser Ala Ala Asn Phe His Lys Leu Phe Leu Lys Glu Leu
170 175 180

aac gaa atc agc aca gta aca cac tct gaa gat agg att ttg ttt gat 1591
Asn Glu Ile Ser Thr Val Thr His Ser Glu Asp Arg Ile Leu Phe Asp
185 190 195

gtc aat gac atc aat ctc tca gaa tta aaa ttt ccc aaa tcg ata tat 1639
Val Asn Asp Ile Asn Leu Ser Glu Leu Lys Phe Pro Lys Ser Ile Tyr
200 205 210

gat agc gca aaa tta cac ctg ccc gct atg aca cca caa atc ttc cac 1687
Asp Ser Ala Lys Leu His Leu Pro Ala Met Thr Pro Gln Ile Phe His
215 220 225

aag caa act cag tca ttt ttc aaa tca ata tac tat aac act tta aaa 1735
Lys Gln Thr Gln Ser Phe Phe Lys Ser Ile Tyr Tyr Asn Thr Leu Lys
230 235 240 245

aga cct ttc ggc tat tta acc aat caa act tcc ctc agc tcg tca gta 1783
Arg Pro Phe Gly Tyr Leu Thr Asn Gln Thr Ser Leu Ser Ser Ser Val
250 255 260

tct gca aca cag ctg aaa aag tat aat gat att cta aat gcg cac acc 1831
Ser Ala Thr Gln Leu Lys Lys Tyr Asn Asp Ile Leu Asn Ala His Thr
265 270 275

tca tta tgc ggg aca aca gta ttt ggg ata gta aac aac caa agg ttt 1879
Ser Leu Cys Gly Thr Thr Val Phe Gly Ile Val Asn Asn Gln Arg Phe
280 285 290

aac tat tta aag tca atc gtt aat caa gag cat ata tgt cta aga agt 1927
Asn Tyr Leu Lys Ser Ile Val Asn Gln Glu His Ile Cys Leu Arg Ser
295 300 305

ttc atc tgt ggt att gca atg ata tgt tta aaa cct ctc gtt aag gat 1975
Phe Ile Cys Gly Ile Ala Met Ile Cys Leu Lys Pro Leu Val Lys Asp
310 315 320 325

ttc agc ggt aca ata gta ttt act att ccc ata aat tta aga aac cac 2023
Phe Ser Gly Thr Ile Val Phe Thr Ile Pro Ile Asn Leu Arg Asn His
330 335 340

tta ggc tta ggt ggg tca ttg ggt ctc ttc ttc aaa gaa cta agg gtc 2071
Leu Gly Leu Gly Gly Ser Leu Gly Leu Phe Phe Lys Glu Leu Arg Val
345 350 355

gaa tgt cca ctt tct cta att gat gac gaa ctt tcc gcc aac gaa ttt 2119
Glu Cys Pro Leu Ser Leu Ile Asp Asp Glu Leu Ser Ala Asn Glu Phe
360 365 370

ttg acc aac agt aac gat aac gag gat aat gat gat gag ttt aat gaa 2167 
Leu Thr Asn Ser Asn Asp Asn Glu Asp Asn Asp Asp Glu Phe Asn Glu 
375 380 385 

aga ttg atg gaa tat caa ttt aat aaa gtt aca aag cac gtt agc ggt 2215
Arg Leu Met Glu Tyr Gln Phe Asn Lys Val Thr Lys His Val Ser Gly
390 395 400 405

ttt att atg gca aaa ctg agg agt tgg gaa aag aat ggg ttt aat gat 2263
Phe Ile Met Ala Lys Leu Arg Ser Trp Glu Lys Asn Gly Phe Asn Asp
410 415 420

gac gat ata agg agg atg aag tat gac aat gac gac gat ttc cat atc 2311
Asp Asp Ile Arg Arg Met Lys Tyr Asp Asn Asp Asp Asp Phe His Ile
425 430 435

caa aat tca agg aca aaa ttg att caa atc aat gat gtt tcc gac ata 2359
Gln Asn Ser Arg Thr Lys Leu Ile Gln Ile Asn Asp Val Ser Asp Ile
440 445 450

tcg tta tcg atg aac ggc gat gac aaa tct ttc aaa att gta agt acg 2407
Ser Leu Ser Met Asn Gly Asp Asp Lys Ser Phe Lys Ile Val Ser Thr
455 460 465

gga ttt aca agt tcg ata aat cgc ccc aca tta atg tct ctt tcc tat 2455
Gly Phe Thr Ser Ser Ile Asn Arg Pro Thr Leu Met Ser Leu Ser Tyr
470 475 480 485

aca tac tgt gaa gag atg ggc ctg aat atc tgt att cac tac cct gat 2503
Thr Tyr Cys Glu Glu Met Gly Leu Asn Ile Cys Ile His Tyr Pro Asp
490 495 500

tcg tat aat tta gaa tct ttt gta gaa tgc ttc gaa tcc ttt att gaa 2551
Ser Tyr Asn Leu Glu Ser Phe Val Glu Cys Phe Glu Ser Phe Ile Glu
505 510 515

taggcaggtg acgcattaaa tatatgtctg tatagtacgt attttttcca ttttatttat 2611
tcttatcaaa atttaatcaa catatatgct aaagaaacta ttgataggag atatgacagg 2671
aaattgcact gtttctggaa ctttggcatg ccgaggccgt catttccagt ataactgagc 2731
aaaaagaagt gacggtaaat aca 2754



<210> 6
<211> 517
<212> PRT

<213> Saccharomyces cerevisiae
<400> 6 

Met Thr Thr Phe Arg Pro Leu Ser Ser Phe Glu Lys Lys Ile Leu Thr
1 5 10 15

Gln Ser Leu Asn Asp Gln Arg Asn Gly Thr Ile Phe Ser Ser Thr Tyr
20 25 30

Ser Lys Ser Leu Ser Arg Glu Asn Asp Ala Asp Trp His Ser Asp Glu
35 40 45

Val Thr Leu Gly Thr Asn Ser Ser Lys Asp Asp Ser Arg Leu Thr Leu
50 55 60

Pro Leu Ile Ala Thr Thr Leu Lys Arg Leu Ile Lys Ser Gln Pro Ala
65 70 75 80

Leu Phe Ala Thr Val Asn Glu Glu Trp Glu Phe Glu Pro Leu Lys Gln
85 90 95

Leu Lys Thr Ser Asp Ile Val Asn Val Ile Glu Phe Glu Thr Ile Lys
100 105 110

Asp Lys Glu Val Asn Cys His Trp Gly Val Pro Pro Pro Tyr Leu Leu
115 120 125

Arg His Ala Phe Asn Lys Thr Arg Phe Val Pro Gly Ser Asn Lys Pro
130 135 140

Leu Trp Thr Leu Tyr Val Ile Asp Glu Ala Leu Leu Val Phe His Gly
145 150 155 160

His Asp Val Leu Phe Asp Ile Phe Ser Ala Ala Asn Phe His Lys Leu
165 170 175

Phe Leu Lys Glu Leu Asn Glu Ile Ser Thr Val Thr His Ser Glu Asp
180 185 190

Arg Ile Leu Phe Asp Val Asn Asp Ile Asn Leu Ser Glu Leu Lys Phe
195 200 205

Pro Lys Ser Ile Tyr Asp Ser Ala Lys Leu His Leu Pro Ala Met Thr
210 215 220

Pro Gln Ile Phe His Lys Gln Thr Gln Ser Phe Phe Lys Ser Ile Tyr
225 230 235 240

Tyr Asn Thr Leu Lys Arg Pro Phe Gly Tyr Leu Thr Asn Gln Thr Ser
245 250 255

Leu Ser Ser Ser Val Ser Ala Thr Gln Leu Lys Lys Tyr Asn Asp Ile
260 265 270

Leu Asn Ala His Thr Ser Leu Cys Gly Thr Thr Val Phe Gly Ile Val
275 280 285

Asn Asn Gln Arg Phe Asn Tyr Leu Lys Ser Ile Val Asn Gln Glu His
290 295 300

Ile Cys Leu Arg Ser Phe Ile Cys Gly Ile Ala Met Ile Cys Leu Lys
305 310 315 320

Pro Leu Val Lys Asp Phe Ser Gly Thr Ile Val Phe Thr Ile Pro Ile
325 330 335

Asn Leu Arg Asn His Leu Gly Leu Gly Gly Ser Leu Gly Leu Phe Phe
340 345 350

Lys Glu Leu Arg Val Glu Cys Pro Leu Ser Leu Ile Asp Asp Glu Leu
355 360 365

Ser Ala Asn Glu Phe Leu Thr Asn Ser Asn Asp Asn Glu Asp Asn Asp
370 375 380

Asp Glu Phe Asn Glu Arg Leu Met Glu Tyr Gln Phe Asn Lys Val Thr
385 390 395 400

Lys His Val Ser Gly Phe Ile Met Ala Lys Leu Arg Ser Trp Glu Lys
405 410 415

Asn Gly Phe Asn Asp Asp Asp Ile Arg Arg Met Lys Tyr Asp Asn Asp
420 425 430

Asp Asp Phe His Ile Gln Asn Ser Arg Thr Lys Leu Ile Gln Ile Asn
435 440 445

Asp Val Ser Asp Ile Ser Leu Ser Met Asn Gly Asp Asp Lys Ser Phe
450 455 460

Lys Ile Val Ser Thr Gly Phe Thr Ser Ser Ile Asn Arg Pro Thr Leu
465 470 475 480

Met Ser Leu Ser Tyr Thr Tyr Cys Glu Glu Met Gly Leu Asn Ile Cys
485 490 495

Ile His Tyr Pro Asp Ser Tyr Asn Leu Glu Ser Phe Val Glu Cys Phe
500 505 510

Glu Ser Phe Ile Glu
515



<210> 7
<211> 1725
<212> DNA

<213> Saccharomyces cerevisiae

<220>
<221> CDS

<222> (1001)..(1522)
<400> 7 

tgcaaaaact gataagggct ttcctgctga tgcgcttgct gattttgcgt atttgccgaa 60 
gattgattga tcaattgcgt aaaggggtcg tcttcttgac ggttgatatt gaatagcatg 120 
ttttgaatac gtagttgatt gacctctttc ttttaattgc gtgcagctgc tctcaggttt 180 
aagatgtacg agggtccacg gggtagcaag cacaagaacg atgatatata tgacagaacg 240 
atggataaga atggtatgtt gtctgcactg ttcagcattc gactacccct ctcccggttc 300
ttttctcctc gtttcaattt aaaaaagcaa ctcgctaccc ggccgcacac cccttattcc 360 
tgttcagccg tttaaggtga gaacccttta cttcatagcc tttgtagatc tttctattgc 420 
taccattgaa gggtcggtga cgtggaaatt ttgacattta tcagtggcgt attgggaggc 480
aagcaattga aagaactgtg atttatttcc gcttgttcga aattattgat gtttagcact 540 
ttgcagtagc gacaatacaa tatatgtgct tttagtgctg ggatagttcg tagctccatt 600 
tcggggcgct tgttacattt attgtatatg cgcggatgtg gcacatgctg ttgagatctc 660 
actcctttgg tatctctttc ctgcgccgca ttgtgccggc agaatgtcgc gcttgtattc 720 
tcatgaactt ttcctcttta cgaacccttt ggcggcatgc cgtttaaaat ctgttgaaga 780 
tttcctttac gaacaatgag caatgttttg cacaggcagg tgggaagtag ggcctatcgc 840
gccttggatg cagatataag tataaatata aattataata attggctgta tcagtaaatc 900

cttcttgcga tgggaggaag cacgatagag tatgttaagc ttttgagagg cttcatattc 960 

attggaattt taaataacaa taaagcaaca acaataataa atg cta tca gct gca 1015
Met Leu Ser Ala Ala
1 5

gat aat tta gtg cgc atc ata aat gct gtt ttt ctt att ata tcc ata 1063
Asp Asn Leu Val Arg Ile Ile Asn Ala Val Phe Leu Ile Ile Ser Ile
10 15 20

ggt cta atc agc ggc ctg ata ggt aca cag aca aag cat agt tct cga 1111
Gly Leu Ile Ser Gly Leu Ile Gly Thr Gln Thr Lys His Ser Ser Arg
25 30 35

gtg aac ttt tgt atg ttt gcc gcc gtt tat ggt ctg gtt acg gat tca 1159
Val Asn Phe Cys Met Phe Ala Ala Val Tyr Gly Leu Val Thr Asp Ser
40 45 50

tta tat ggg ttt ttg gct aat ttc tgg aca tca tta aca tac cca gca 1207
Leu Tyr Gly Phe Leu Ala Asn Phe Trp Thr Ser Leu Thr Tyr Pro Ala
55 60 65

att ttg ctt gtt ttg gat ttt tta aat ttc ata ttt acg ttt gta gca 1255
Ile Leu Leu Val Leu Asp Phe Leu Asn Phe Ile Phe Thr Phe Val Ala
70 75 80 85

gcc acc gct ttg gct gta ggt ata aga tgc cat tcg tgt aaa aac aaa 1303
Ala Thr Ala Leu Ala Val Gly Ile Arg Cys His Ser Cys Lys Asn Lys
90 95 100

aca tat ctg gaa cag aat aag atc ata caa ggc tca agc tcc aga tgt 1351
Thr Tyr Leu Glu Gln Asn Lys Ile Ile Gln Gly Ser Ser Ser Arg Cys
105 110 115

cat caa tct cag gct gct gtt gcg ttt ttt tac ttt tcc tgt ttt cta 1399
His Gln Ser Gln Ala Ala Val Ala Phe Phe Tyr Phe Ser Cys Phe Leu
120 125 130

ttc ctc atc aaa gtg act gtg gcc acg atg ggt atg atg caa aat ggt 1447
Phe Leu Ile Lys Val Thr Val Ala Thr Met Gly Met Met Gln Asn Gly
135 140 145

gga ttt ggc tct aat acc gga ttc agc aga agg agg gca aga aga caa 1495
Gly Phe Gly Ser Asn Thr Gly Phe Ser Arg Arg Arg Ala Arg Arg Gln
150 155 160 165

atg ggc ata cct aca att tcc cag gtt taagcctact ggactgaaaa 1542
Met Gly Ile Pro Thr Ile Ser Gln Val
170

aaaggcaatt cgcgtacaat tttcgttgat cgttctttat ataacctttg cattaaataa 1602 
atttaacaaa aaaagttctt tctaaaataa tattatggtg atacatgaat gtgctttagt 1662 
tttttcgtag gctcatccat gtatatatat aaatgataaa aaactaagtt acgatattga 1722
tag 1725 



<210> 8 
<211> 174 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 8 

Met Leu Ser Ala Ala Asp Asn Leu Val Arg Ile Ile Asn Ala Val Phe
1 5 10 15

Leu Ile Ile Ser Ile Gly Leu Ile Ser Gly Leu Ile Gly Thr Gln Thr
20 25 30

Lys His Ser Ser Arg Val Asn Phe Cys Met Phe Ala Ala Val Tyr Gly
35 40 45

Leu Val Thr Asp Ser Leu Tyr Gly Phe Leu Ala Asn Phe Trp Thr Ser
50 55 60

Leu Thr Tyr Pro Ala Ile Leu Leu Val Leu Asp Phe Leu Asn Phe Ile
65 70 75 80

Phe Thr Phe Val Ala Ala Thr Ala Leu Ala Val Gly Ile Arg Cys His
85 90 95

Ser Cys Lys Asn Lys Thr Tyr Leu Glu Gln Asn Lys Ile Ile Gln Gly
100 105 110

Ser Ser Ser Arg Cys His Gln Ser Gln Ala Ala Val Ala Phe Phe Tyr
115 120 125

Phe Ser Cys Phe Leu Phe Leu Ile Lys Val Thr Val Ala Thr Met Gly
130 135 140

Met Met Gln Asn Gly Gly Phe Gly Ser Asn Thr Gly Phe Ser Arg Arg
145 150 155 160

Arg Ala Arg Arg Gln Met Gly Ile Pro Thr Ile Ser Gln Val
165 170



<210> 9
<211> 1791
<212> DNA

<213> Saccharomyces cerevisiae

<220>
<221> CDS

<222> (1001)..(1588)
<400> 9 

gtagatgaat tcaaatctat gattaagaac aatgaattca ttgaatgggc gcaattctcc 60 
ggtaactact atggtagtac tgtcgcttcc gtcaaacaag tcagtaaatc tggtaagact 120 
tgtattttag atattgatat gcagggtgtc aaatctgtca aggctatccc agagttaaat 180 
gccaggtttt tgtttattgc tccaccatcg gtcgaggatt tgaaaaaaag attagaaggt 240 
agaggtacgg agaccgaaga atccatcaac aagaggttaa gcgccgctca agctgaattg 300
gcatatgctg agacaggtgc ccatgacaaa gttattgtca atgatgattt ggacaaggcc 360
tacaaggaat tgaaggattt tatctttgca gaaaaatgat gtagccctat atagacatta 420 
ctaagtatgt acctggtagg agagtgctgt cgcaaagcga caaaacgtcc aattattcaa 480 
ttaatatagt gtaaaagttc tcaacgggct tatgctagtt ttttttgtta gtaagcgcta 540 
cgacgactag aaccatctct tgaatttcca agtgccaaaa tcaatgacca cggatactgt 600 
ggccaggaat ctgttggttg gtcatcctca agatctagac aatatcatat tgggccagta 660 
tctgattatc ttaactatat gcgcccctct agtttacaag ttttagtcat tgggggttgg 720 
aagggctgat ccccccttac aattggcgtc gtttaggagc gggcgaggct ctcctttctc 780 
ttacacatct gctaaggtgt ttgttacccg agtaatcaag gatcaactat ggatgagatt 840
tagattaacg tatttagagc agacgattgt aagaatatat tttgtaattt cgattgtttt 900 
ttgctaccta cattgtttat cttgaaatat ccaaagtgaa cactattact gttttttgct 960
caagaatata ttagccttac aagaacgtaa aaaaccaatc atg gta gca gaa gtt 1015
Met Val Ala Glu Val
1 5

caa aaa caa gcc cca cca ttt aag aaa acc gcc gta gtc gac ggt atc 1063
Gln Lys Gln Ala Pro Pro Phe Lys Lys Thr Ala Val Val Asp Gly Ile
10 15 20

ttc gag gaa att tca ctg gaa aag tat aaa ggt aag tac gtt gtt cta 1111
Phe Glu Glu Ile Ser Leu Glu Lys Tyr Lys Gly Lys Tyr Val Val Leu
25 30 35

gct ttt gtc cca ttg gct ttt tca ttt gtc tgt cca act gag att gtt 1159
Ala Phe Val Pro Leu Ala Phe Ser Phe Val Cys Pro Thr Glu Ile Val
40 45 50

gcg ttt tcc gat gcc gcc aag aaa ttc gaa gat cag ggc gcc caa gtt 1207
Ala Phe Ser Asp Ala Ala Lys Lys Phe Glu Asp Gln Gly Ala Gln Val
55 60 65

tta ttt gcc tcc acc gac tct gaa tat tcc tta ctg gca tgg acc aac 1255
Leu Phe Ala Ser Thr Asp Ser Glu Tyr Ser Leu Leu Ala Trp Thr Asn
70 75 80 85

ctt ccc aga aaa gac ggt gga tta ggt cca gtt aaa gtt cct ttg ctt 1303
Leu Pro Arg Lys Asp Gly Gly Leu Gly Pro Val Lys Val Pro Leu Leu
90 95 100

gct gat aag aat cat tcc tta tcc aga gac tat ggc gtt ttg att gaa 1351
Ala Asp Lys Asn His Ser Leu Ser Arg Asp Tyr Gly Val Leu Ile Glu
105 110 115



aaa gaa ggt ata gct tta aga ggt ttg ttc ata atc gac ccg aag gga 1399
Lys Glu Gly Ile Ala Leu Arg Gly Leu Phe Ile Ile Asp Pro Lys Gly
120 125 130

atc att aga cat atc act atc aat gat tta tct gtt ggc aga aac gtc 1447
Ile Ile Arg His Ile Thr Ile Asn Asp Leu Ser Val Gly Arg Asn Val
135 140 145

aat gaa gct ttg aga tta gtc gaa ggt ttc cag tgg act gac aaa aat 1495
Asn Glu Ala Leu Arg Leu Val Glu Gly Phe Gln Trp Thr Asp Lys Asn
150 155 160 165

ggt aca gtt ttg cca tgc aac tgg acc cca gga gcc gcc acc atc aaa 1543
Gly Thr Val Leu Pro Cys Asn Trp Thr Pro Gly Ala Ala Thr Ile Lys
170 175 180

cct gac gtt aaa gat tcc aag gag tat ttc aaa aat gcc aat aat 1588
Pro Asp Val Lys Asp Ser Lys Glu Tyr Phe Lys Asn Ala Asn Asn
185 190 195

taatcttcgc acgataacgc taggccctat taaataatta aaaatacatc accctatata 1648
tgataagaaa gatggttttg tattattatg aaattgactt gaaagaatag tgtaacaaaa 1708
gaaaaagaaa ctgtaattga agaatgatat gcatttctat gtgtatatta acttaatcat 1768
ctttatatcc agaagacgca aat 1791



<210> 10 
<211> 196 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 10 

Met Val Ala Glu Val Gln Lys Gln Ala Pro Pro Phe Lys Lys Thr Ala
1 5 10 15

Val Val Asp Gly Ile Phe Glu Glu Ile Ser Leu Glu Lys Tyr Lys Gly
20 25 30

Lys Tyr Val Val Leu Ala Phe Val Pro Leu Ala Phe Ser Phe Val Cys
35 40 45

Pro Thr Glu Ile Val Ala Phe Ser Asp Ala Ala Lys Lys Phe Glu Asp
50 55 60

Gln Gly Ala Gln Val Leu Phe Ala Ser Thr Asp Ser Glu Tyr Ser Leu
65 70 75 80

Leu Ala Trp Thr Asn Leu Pro Arg Lys Asp Gly Gly Leu Gly Pro Val
85 90 95

Lys Val Pro Leu Leu Ala Asp Lys Asn His Ser Leu Ser Arg Asp Tyr
100 105 110

Gly Val Leu Ile Glu Lys Glu Gly Ile Ala Leu Arg Gly Leu Phe Ile
115 120 125

Ile Asp Pro Lys Gly Ile Ile Arg His Ile Thr Ile Asn Asp Leu Ser
130 135 140

Val Gly Arg Asn Val Asn Glu Ala Leu Arg Leu Val Glu Gly Phe Gln
145 150 155 160

Trp Thr Asp Lys Asn Gly Thr Val Leu Pro Cys Asn Trp Thr Pro Gly
165 170 175

Ala Ala Thr Ile Lys Pro Asp Val Lys Asp Ser Lys Glu Tyr Phe Lys
180 185 190

Asn Ala Asn Asn
195



<210> 11
<211> 3455
<212> DNA

<213> Saccharomyces cerevisiae

<220>
<221> CDS

<222> (1001)..(2452)
<400> 11 

gtggtgaaaa tgaaggaaat ttacaagatt gtggatgacg aagttgtcat ggacatgaga 60 
ttagtgagtc gggtcattgg taatcccttg ttaaaggaat caaaggagtt tcgtcaagat 120 
ttgaatgcca ggccattagc tagattggaa cgtttgaaaa tcttgataaa ctatgcagtt 180 
aagatctctc cgcataagga aaaattcccc tatgtgaggt ggacagtggg taaaaacaag 240
tacatacatg agctcatggt cccagagcgc tttcccattg atattcccag agaaaatgtc 300 
: ;ggttagaaa gaactcagat tccattaatg ctatgctggg cactgtccat tcataaggca 360 
cagggtcaaa ctattcaaag actaaaggtc gacttgagga gaattttcga agccggccaa 420 
gtttatgttg cactgtcaag agcggtaact atggacacct tacaggtcct aaactttgat 480 
ccaggaaaga ttcgcaccaa tgaaagagta aaagatttct ataaacgttt agaaactttg 540 
aaatgacttg caacgaataa atgcatatac tctagttgaa gttttctttt cttgttctat 600 
acaggttcga atacttgtga gcctatctgt ataatttaac agaatcccga aatattcatc 660 
tagaagccat ctatttagct aagcctacgt atgcggcgat ttttatatta tctttttttt 720
tttttataga agactgcgaa atgttggcag aatggaaagt ttcagtgtta aaaatagaaa 780 
ctgaaaaagg agatctagcc aggaatatat cgaaaaaaaa agtgagggaa atcagatcct 840
acacaaatat ttagatttaa ttgaagaccc tggtctgcca gatatatata tatattagac 900
gaactgtgca ttcagtcagc aaatctaggc cacagatttt cttattgaag ctatcaaaat 960



agtagaaata attgaagggc gtgtataaca attctgggag atg gct gat aag ata 1015
Met Ala Asp Lys Ile
1 5

gag agg cat act ttc aag gtc ttc aat caa gat ttc agt gta gat aag 1063
Glu Arg His Thr Phe Lys Val Phe Asn Gln Asp Phe Ser Val Asp Lys
10 15 20

agg ttt caa ctt atc aaa gaa ata ggg cat gga gca tac ggc ata gtg 1111
Arg Phe Gln Leu Ile Lys Glu Ile Gly His Gly Ala Tyr Gly Ile Val
25 30 35

tgt tca gcg cgg ttt gca gaa gct gcc gaa gat acc aca gtt gcc atc 1159
Cys Ser Ala Arg Phe Ala Glu Ala Ala Glu Asp Thr Thr Val Ala Ile
40 45 50

aag aaa gtg aca aac gtt ttt tcg aag acc tta cta tgt aaa aga tcc 1207
Lys Lys Val Thr Asn Val Phe Ser Lys Thr Leu Leu Cys Lys Arg Ser
55 60 65

cta cgt gag cta aag ctt ttg aga cat ttc aga ggc cac aaa aat att 1255
Leu Arg Glu Leu Lys Leu Leu Arg His Phe Arg Gly His Lys Asn Ile
70 75 80 85

aca tgt ctt tat gat atg gat att gtt ttt tat cca gac ggg tct atc 1303
Thr Cys Leu Tyr Asp Met Asp Ile Val Phe Tyr Pro Asp Gly Ser Ile
90 95 100

aat gga cta tat ctt tat gag gaa ctt atg gaa tgt gat atg cac caa 1351
Asn Gly Leu Tyr Leu Tyr Glu Glu Leu Met Glu Cys Asp Met His Gln
105 110 115

atc atc aaa tcc ggt caa cct ttg acg gat gct cac tat caa agt ttc 1399
Ile Ile Lys Ser Gly Gln Pro Leu Thr Asp Ala His Tyr Gln Ser Phe
120 125 130

aca tac caa ata tta tgt ggt tta aag tat att cat tct gca gat gtc 1447
Thr Tyr Gln Ile Leu Cys Gly Leu Lys Tyr Ile His Ser Ala Asp Val
135 140 145

ttg cat cgt gat ttg aag ccc ggc aat ttg ctt gtc aat gca gat tgt 1495
Leu His Arg Asp Leu Lys Pro Gly Asn Leu Leu Val Asn Ala Asp Cys
150 155 160 165

caa ttg aaa atc tgt gat ttt ggg tta gct aga ggt tat tcg gag aat 1543
Gln Leu Lys Ile Cys Asp Phe Gly Leu Ala Arg Gly Tyr Ser Glu Asn
170 175 180



cct gtc gaa aac agt caa ttt ttg acg gag tac gtg gcc act aga tgg 1591
Pro Val Glu Asn Ser Gln Phe Leu Thr Glu Tyr Val Ala Thr Arg Trp
185 190 195

tat aga gct ccg gaa ata atg ttg agt tac caa gga tat acc aag gcg 1639
Tyr Arg Ala Pro Glu Ile Met Leu Ser Tyr Gln Gly Tyr Thr Lys Ala
200 205 210

att gac gta tgg tca gct ggc tgt att tta gcg gag ttt ctt ggt gga 1687
Ile Asp Val Trp Ser Ala Gly Cys Ile Leu Ala Glu Phe Leu Gly Gly
215 220 225

aag cca atc ttc aaa gga aag gat tac gtt aat caa ttg aat caa ata 1735
Lys Pro Ile Phe Lys Gly Lys Asp Tyr Val Asn Gln Leu Asn Gln Ile
230 235 240 245

tta caa gtt tta ggg aca ccc cca gac gaa act tta aga agg att ggt 1783
Leu Gln Val Leu Gly Thr Pro Pro Asp Glu Thr Leu Arg Arg Ile Gly
250 255 260

tct aaa aat gtt cag gac tac ata cat caa tta ggt ttc att cca aaa 1831
Ser Lys Asn Val Gln Asp Tyr Ile His Gln Leu Gly Phe Ile Pro Lys
265 270 275

gta cct ttt gtc aat tta tac cca aat gcc aat tca caa gca tta gac 1879
Val Pro Phe Val Asn Leu Tyr Pro Asn Ala Asn Ser Gln Ala Leu Asp
280 285 290

tta ttg gag caa atg ctc gcg ttt gac cct caa aag aga att acc gtg 1927
Leu Leu Glu Gln Met Leu Ala Phe Asp Pro Gln Lys Arg Ile Thr Val
295 300 305

gat gag gcc ctg gag cat cct tac ttg tct ata tgg cat gat cca gct 1975
Asp Glu Ala Leu Glu His Pro Tyr Leu Ser Ile Trp His Asp Pro Ala
310 315 320 325

gac gaa cct gtg tgt agt gaa aaa ttc gaa ttt agt ttt gaa tcg gtt 2023
Asp Glu Pro Val Cys Ser Glu Lys Phe Glu Phe Ser Phe Glu Ser Val
330 335 340

aat gat atg gag gac tta aaa caa atg gtt ata caa gaa gtg caa gat 2071
Asn Asp Met Glu Asp Leu Lys Gln Met Val Ile Gln Glu Val Gln Asp
345 350 355

ttc agg ctg ttt gtg aga caa ccg cta tta gaa gag caa agg caa tta 2119
Phe Arg Leu Phe Val Arg Gln Pro Leu Leu Glu Glu Gln Arg Gln Leu
360 365 370

caa tta cag cag cag caa cag cag cag caa cag caa cag caa cag caa 2167
Gln Leu Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
375 380 385

cag cag cct tca gat gtg gat aat ggc aac gcc gca gcg agt gaa gaa 2215
Gln Gln Pro Ser Asp Val Asp Asn Gly Asn Ala Ala Ala Ser Glu Glu
390 395 400 405

aat tat cca aaa cag atg gcc acg tct aat tct gtt gcg cca caa caa 2263
Asn Tyr Pro Lys Gln Met Ala Thr Ser Asn Ser Val Ala Pro Gln Gln
410 415 420

gaa tca ttt ggt att cac tcc caa aat ttg cca agg cat gat gca gat 2311
Glu Ser Phe Gly Ile His Ser Gln Asn Leu Pro Arg His Asp Ala Asp
425 430 435

ttc cca cct cga cct caa gag agt atg atg gag atg aga cct gcc act 2359
Phe Pro Pro Arg Pro Gln Glu Ser Met Met Glu Met Arg Pro Ala Thr
440 445 450

gga aat acc gca gat att ccg cct cag aat gat aac ggc acg ctt cta 2407
Gly Asn Thr Ala Asp Ile Pro Pro Gln Asn Asp Asn Gly Thr Leu Leu
455 460 465

gac ctt gaa aaa gag ctg gag ttt gga tta gat aga aaa tat ttt 2452
Asp Leu Glu Lys Glu Leu Glu Phe Gly Leu Asp Arg Lys Tyr Phe
470 475 480

taggacaaaa aactataagt aaccggggaa gtatagaatc accatagatg taagcttaca 2512 
gacaatgtgt atatatgatg tatatgaacg tatacaaata tatatatata tacgtgctct 2572 
tgttgtagct cgtatatcaa attcctcctc cgacgcttat cttaatcgta ctccgcggaa 2632 
gtttgttatc gcctcttgaa ttctttcttt tcgttcattt atgattagtc atctatagac 2692 
aatattcatt atttaagcac ctagaatact aaactaaatg tctaaatatg acacaaggaa 2752 
gataagataa aaaaaaccaa gcgcttagaa tatgacttta atggtacctt tcaaacaagt 2812 
tgatgtattc actgagaagc cctttatggg aaatccagta gcagtaataa acttcttgga 2872 
aattgatgaa aatgaagtca gtcaagaaga attgcaggca attgccaact ggacaaactt 2932 
atcagaaaca acgtttttat ttaaaccatc tgataaaaag tatgattaca agttgaggat 2992 
ctttactcca agaagtgaat tgccatttgc tggtcaccca accattggtt catgtaaggc 3052 
tttccttgag ttcaccaaaa acaccactgc gacttctctc gtccaggaat gtaaaatagg 3112 
cgctgttcca ataacaatta atgagggact aattagcttc aaagctccga tggctgatta 3172 
cgaaagtata tcgagtgaga tgattgctga ttatgaaaaa gcgattggtt tgaaattcat 3232 
aaagcctcct gctcttttac atactgggcc agagtggatc gtggcgctag tagaagatgc 3292 
agaaacttgc ttcaatgcaa acccaaattt tgctatgctt gcacaccaga caaaacagaa 3352 
tgaccatgtg ggaattatcc tagcgggccc taaaaaggaa gccgccatca aaaactccta 3412 
cgaaatgagg gcgtttgctc cggtgataaa cgtttatgaa gat 3455 



<210> 12 
<211> 484 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 12 

Met Ala Asp Lys Ile Glu Arg His Thr Phe Lys Val Phe Asn Gln Asp 
1 5 10 15 

Phe Ser Val Asp Lys Arg Phe Gln Leu Ile Lys Glu Ile Gly His Gly 
20 25 30 

Ala Tyr Gly Ile Val Cys Ser Ala Arg Phe Ala Glu Ala Ala Glu Asp 
35 40 45 

Thr Thr Val Ala Ile Lys Lys Val Thr Asn Val Phe Ser Lys Thr Leu 
50 55 60 

Leu Cys Lys Arg Ser Leu Arg Glu Leu Lys Leu Leu Arg His Phe Arg 
65 70 75 80 

Gly His Lys Asn Ile Thr Cys Leu Tyr Asp Met Asp Ile Val Phe Tyr 
85 90 95 

Pro Asp Gly Ser Ile Asn Gly Leu Tyr Leu Tyr Glu Glu Leu Met Glu 
100 105 110 

Cys Asp Met His Gln Ile Ile Lys Ser Gly Gln Pro Leu Thr Asp Ala 
115 120 125 

His Tyr Gln Ser Phe Thr Tyr Gln Ile Leu Cys Gly Leu Lys Tyr Ile 
130 135 140 

His Ser Ala Asp Val Leu His Arg Asp Leu Lys Pro Gly Asn Leu Leu 
145 150 155 160 

Val Asn Ala Asp Cys Gln Leu Lys Ile Cys Asp Phe Gly Leu Ala Arg 
165 170 175 

Gly Tyr Ser Glu Asn Pro Val Glu Asn Ser Gln Phe Leu Thr Glu Tyr 
180 185 190 

Val Ala Thr Arg Trp Tyr Arg Ala Pro Glu Ile Met Leu Ser Tyr Gln 
195 200 205 

Gly Tyr Thr Lys Ala Ile Asp Val Trp Ser Ala Gly Cys Ile Leu Ala 
210 215 220 

Glu Phe Leu Gly Gly Lys Pro Ile Phe Lys Gly Lys Asp Tyr Val Asn 
225 230 235 240 

Gln Leu Asn Gln Ile Leu Gln Val Leu Gly Thr Pro Pro Asp Glu Thr 
245 250 255 

Leu Arg Arg Ile Gly Ser Lys Asn Val Gln Asp Tyr Ile His Gln Leu 
260 265 270 

Gly Phe Ile Pro Lys Val Pro Phe Val Asn Leu Tyr Pro Asn Ala Asn 
275 280 285 

Ser Gln Ala Leu Asp Leu Leu Glu Gln Met Leu Ala Phe Asp Pro Gln 
290 295 300 

Lys Arg Ile Thr Val Asp Glu Ala Leu Glu His Pro Tyr Leu Ser Ile 
305 310 315 320 

Trp His Asp Pro Ala Asp Glu Pro Val Cys Ser Glu Lys Phe Glu Phe 
325 330 335 

Ser Phe Glu Ser Val Asn Asp Met Glu Asp Leu Lys Gln Met Val Ile 
340 345 350 

Gln Glu Val Gln Asp Phe Arg Leu Phe Val Arg Gln Pro Leu Leu Glu 
355 360 365 

Glu Gln Arg Gln Leu Gln Leu Gln Gln Gln Gln Gln Gln Gln Gln Gln 
370 375 380 

Gln Gln Gln Gln Gln Gln Gln Pro Ser Asp Val Asp Asn Gly Asn Ala 
385 390 395 400 

Ala Ala Ser Glu Glu Asn Tyr Pro Lys Gln Met Ala Thr Ser Asn Ser 
405 410 415 

Val Ala Pro Gln Gln Glu Ser Phe Gly Ile His Ser Gln Asn Leu Pro 
420 425 430 

Arg His Asp Ala Asp Phe Pro Pro Arg Pro Gln Glu Ser Met Met Glu 
435 440 445 

Met Arg Pro Ala Thr Gly Asn Thr Ala Asp Ile Pro Pro Gln Asn Asp 
450 455 460 

Asn Gly Thr Leu Leu Asp Leu Glu Lys Glu Leu Glu Phe Gly Leu Asp 
465 470 475 480 

Arg Lys Tyr Phe 



<210> 13 
<211> 3302 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(2299) 
<400> 13 

agtaatactt gcaaatattg caaaacttgg aagaatgtta atgaatcatt tcttgcacca 60 
ttctttcaat catctcaatc tcctgctgtg atgtttaagt ataacattga agactatgcc 120 
ctaatttcca atgttattta gttttaagca tatctttgtt tctaacagga aactcaggcc 180 
cacatccgca aaaaaatatg tgccaaaaaa ctttcaacac ttcaaagata cttaccactg 240. 
caggaaaata atctacgtgt aacggtttga aaataaattt gacttcataa ttggacataa 300 
gtactccatc gccatccctt tttaaagaag tttccacaag aatgaatggc taatcgcaac 360 
taaatctttt ccttgcaaac gtaacacagt atcgacattt tcttactcaa tccaacgaag 420 
gaataaccta tctaaaaaat aaacgccgta gttttcagcc cacaagacgt cattaaaaga 480 

tttgttaatt ataaaaatag aaatatttct accagcatga ttattcgtta cttgaaagtc 540 

cccaataaat ttcactgttt ccgttaactg ttgtagttat taaacgcagc aaacagatta 600 

ttttgaacaa caccggagaa acacgcgcag acccattcga gttaaaaata gtaactcgcg 660 

atcaatcaat gcaggaagca ccgtaggaat tagtaagaac tcgtattttg attgaaaatg 720 

ccatgaaagc aattgacttg ctgcagtaaa aagcgctgcc acaaactttg taattttcga 780 

caatgacgtt cttttcagat ggttactgtc tttttttgga agaaacaaaa gaaggtactt 840 

ttatgatgtt atactaggca aaaagcctat ttaatgtaag teetuttgt cgtttgagac 900 

tggatgaaaa gggacaaaat ggaaggataa ctaaaggtga cttaccgcca gattaattcg 960 

gcctggaata gtttgatatc gaagaaagat tcacaattaa atg gcg act gac acc 1015 
Met Ala Thr Asp Thr 
1 5 

gag agg tgt att ttc cgt gca ttc ggc caa gat ttt atc cta aat aaa 1063 
Glu Arg Cys Ile Phe Arg Ala Phe Gly Gln Asp Phe Ile Leu Asn Lys 
10 15 20 

cat ttt cat ttg aca ggt aag att ggt cgg ggc tca cac agc ctt att 1111 
His Phe His Leu Thr Gly Lys Ile Gly Arg Gly Ser His Ser Leu Ile 
25 30 35 

tgt tct tca act tac aca gaa tcg aac gag gaa act cac gtg gct atc 1159 
Cys Ser Ser Thr Tyr Thr Glu Ser Asn Glu Glu Thr His Val Ala Ile 
40 45 50 

aga aaa ata cca aac gcg ttt ggc aat aaa cta tct tgc aag aga act 1207 
Arg Lys Ile Pro Asn Ala Phe Gly Asn Lys Leu Ser Cys Lys Arg Thr 
55 60 65 

ctt cgt gaa ttg aaa cta cta aga cat tta aga ggg cac cca aat ata 
Leu Arg Glu Leu Lys Leu Leu Arg His Leu Arg Gly His Pro Asn Ile 
70 75 80 85 

gtg tgg ctc ttc gat act gat ata gta ttt tac cca aat ggg gca cta 1303 
Val Trp Leu Phe Asp Thr Asp Ile Val Phe Tyr Pro Asn Gly Ala Leu 
90 95 100 

aat ggc gtt tat tta tat gaa gaa cta atg gaa tgt gac ctt tct caa 1351 
Asn Gly Val Tyr Leu Tyr Glu Glu Leu Met Glu Cys Asp Leu Ser Gln 
105 110 115 

att ata agg tcc gaa caa cgc ctg gaa gac gca cac ttt caa agc ttc 1399 
Ile Ile Arg Ser Glu Gln Arg Leu Glu Asp Ala His Phe Gln Ser Phe 
120 125 130 

ata tat cag ata ctg tgt gct ctg aaa tac ata cat tct gct aat gtt 1447 
Ile Tyr Gln Ile Leu Cys Ala Leu Lys Tyr Ile His Ser Ala Asn Val 
135 140 145 

tta cat tgt gac ctg aaa cca aaa aac tta ctt gtt aat agt gat tgc 1495 
Leu His Cys Asp Leu Lys Pro Lys Asn Leu Leu Val Asn Ser Asp Cys 
150 155 160 165 

caa cta aaa att tgt aat ttt ggg cta tcg tgt agt tat tca gaa aac 1543 
Gln Leu Lys Ile Cys Asn Phe Gly Leu Ser Cys Ser Tyr Ser Glu Asn 
170 175 180 

cac aag gtt aac gac ggc ttc att aag ggt tat ata acc tcg ata tgg 1591 
His Lys Val Asn Asp Gly Phe Ile Lys Gly Tyr Ile Thr Ser Ile Trp 
185 190 195 

tat aaa gca cca gaa att ttg ctg aat tat caa gaa tgc aca aaa gct 1639 
Tyr Lys Ala Pro Glu Ile Leu Leu Asn Tyr Gln Glu Cys Thr Lys Ala 
200 205 210 

gtc gat att tgg tca aca ggc tgt atc ttg gcc gaa cta ctt ggt agg 1687 
Val Asp Ile Trp Ser Thr Gly Cys Ile Leu Ala Glu Leu Leu Gly Arg 
215 220 225 

aaa cca atg ttt gaa ggg aag gat tat gta gat cat ttg aat cat att 1735 
Lys Pro Met Phe Glu Gly Lys Asp Tyr Val Asp His Leu Asn His Ile 
230 235 240 245 

cta caa ata ctt gga aca cca cct gag gaa aca ttg cag gaa att gcc 1783 
Leu Gln Ile Leu Gly Thr Pro Pro Glu Glu Thr Leu Gln Glu Ile Ala 
250 255 260 

tct caa aag gtg tat aat tat atc ttt cag ttc ggt aat atc ccg gga 1831 
Ser Gln Lys Val Tyr Asn Tyr Ile Phe Gln Phe Gly Asn Ile Pro Gly 
265 270 275 

aga tcg ttt gaa agc ata cta cct ggt gct aat cca gaa gcg ctt gaa 1879 
Arg Ser Phe Glu Ser Ile Leu Pro Gly Ala Asn Pro Glu Ala Leu Glu 
280 285 290 

ttg cta aag aaa atg cta gaa ttt gat cct aaa aaa agg att act gta 1927 
Leu Leu Lys Lys Met Leu Glu Phe Asp Pro Lys Lys Arg Ile Thr Val 
295 300 305 

gag gat gca cta gag cat cca tat ttg tca atg tgg cat gat ata gat 1975 
Glu Asp Ala Leu Glu His Pro Tyr Leu Ser Met Trp His Asp Ile Asp 
310 315 320 325 

gag gaa ttc tca tgt caa aag acc ttt aga ttc gaa ttc gag cat atc 2023 
Glu Glu Phe Ser Cys Gln Lys Thr Phe Arg Phe Glu Phe Glu His Ile 
330 335 340 

gaa agt atg gcg gaa tta gga aac gaa gtt ata aag gaa gta ttt gat 2071 
Glu Ser Met Ala Glu Leu Gly Asn Glu Val Ile Lys Glu Val Phe Asp 
345 350 355 

ttc agg aaa gtt gtt aga aaa cat cct att agc ggt gat tcc cca tca 2119 
Phe Arg Lys Val Val Arg Lys His Pro Ile Ser Gly Asp Ser Pro Ser 
360 365 370 

tca tca cta tct tta gag gat gcc att cct caa gaa gtt gta cag gtc 2167 
Ser Ser Leu Ser Leu Glu Asp Ala Ile Pro Gln Glu Val Val Gln Val 
375 380 385 

cat cct tct agg aaa gtt tta ccc agt tat agt cct gaa ttt tcc tat 2215 
His Pro Ser Arg Lys Val Leu Pro Ser Tyr Ser Pro Glu Phe Ser Tyr 
390 395 400 405 

gta agc caa ctt cca tca cta act aca acc cag cca tat caa aac ctt 2263 
Val Ser Gln Leu Pro Ser Leu Thr Thr Thr Gln Pro Tyr Gln Asn Leu 
410 415 420 

atg gga ata agc tct aat tca ttt cag ggt gtt aac taaaaggaaa 2309 
Met Gly Ile Ser Ser Asn Ser Phe Gln Gly Val Asn 
425 430 

acaccttcaa acaagatact aagcatgaaa atagtgaact actgaacgga cctactgagc 2369 

caaatataac aaaaatgagc ccagtttcat cgtctccccc aggtcacgat ataaatgtca 2429 

atgatggtac aaaccaaaat acaaatgagg atgacagcga ttttttcttc gacctagaaa 2489 

aagaacttga attatttaga cgataaattt ttgtagcaga aaaccacaac taatagatgc 2549 

gcacatacac tatctataat gaatatgtaa aatgcctgtt caccttctta attattggta 2609 

tatacttcaa atattgcaaa aagagaaagt cctctcggcg gttttgcagt tccttccgaa 2669 

agcgggaaaa accaaaatgt gagaaagtag gatacaccat tgcgtagatt cgcgatgatc 2729 
cgaatataaa catgattccc tcgtcagtcc tctctcaagt tttctttccc gttttaaata 2789 
gcttactaat attttcacaa aaaagttgat atcatttaaa ggtgcttttg gcgggattga 2849 
atgatgaaaa gattacaccc cttgagaatt caagttcatc tgaaatctga ttacccactg 2909 
tttactttcg agcaattact ctctacaaat gggataagaa gaggccaaac tgcgagaatt 2969 
tctttgaaag attacataga gtggcaaaat ttcccaaaca taatgaaaag agaaaatttt 3029 
tttacgcaaa ggaagcctgt aactacaacc gcaaaagaag aacccttttc atttgataac 3089 
attcttgact gtgagccaca atttagcaaa tgccttgcca aatggctact ggttaattac 3149 
aaattaaatg actatcctta ttacgatctt aacattgtga atatttacac ggatttaccc 3209 
caagcaattc agatttgcaa aaatttaatg tcatatctca agtctacttt atctgataac 3269 
atgttccaga aaataaaata tttcatggta cct 3302 



<210> 14 
<211> 433 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 14 

Met Ala Thr Asp Thr Glu Arg Cys Ile Phe Arg Ala Phe Gly Gln Asp 
1 5 10 15 

Phe Ile Leu Asn Lys His Phe His Leu Thr Gly Lys Ile Gly Arg Gly 
20 25 30 

Ser His Ser Leu Ile Cys Ser Ser Thr Tyr Thr Glu Ser Asn Glu Glu 
35 40 45 

Thr His Val Ala Ile Arg Lys Ile Pro Asn Ala Phe Gly Asn Lys Leu 
50 55 60 

Ser Cys Lys Arg Thr Leu Arg Glu Leu Lys Leu Leu Arg His Leu Arg 
65 70 75 80 

Gly His Pro Asn Ile Val Trp Leu Phe Asp Thr Asp Ile Val Phe Tyr 
85 90 95 

Pro Asn Gly Ala Leu Asn Gly Val Tyr Leu Tyr Glu Glu Leu Met Glu 
100 105 110 

Cys Asp Leu Ser Gln Ile Ile Arg Ser Glu Gln Arg Leu Glu Asp Ala 
115 120 125 

His Phe Gln Ser Phe Ile Tyr Gln Ile Leu Cys Ala Leu Lys Tyr Ile 
130 135 140 

His Ser Ala Asn Val Leu His Cys Asp Leu Lys Pro Lys Asn Leu Leu 
145 150 155 160 

Val Asn Ser Asp Cys Gln Leu Lys Ile Cys Asn Phe Gly Leu Ser Cys 
165 170 175 

Ser Tyr Ser Glu Asn His Lys Val Asn Asp Gly Phe Ile Lys Gly Tyr 
180 185 190 

Ile Thr Ser Ile Trp Tyr Lys Ala Pro Glu Ile Leu Leu Asn Tyr Gln 
195 200 205 

Glu Cys Thr Lys Ala Val Asp Ile Trp Ser Thr Gly Cys Ile Leu Ala 
210 215 220 

Glu Leu Leu Gly Arg Lys Pro Met Phe Glu Gly Lys Asp Tyr Val Asp 
225 230 235 240 

His Leu Asn His Ile Leu Gln Ile Leu Gly Thr Pro Pro Glu Glu Thr 
245 250 255 

Leu Gln Glu Ile Ala Ser Gln Lys Val Tyr Asn Tyr Ile Phe Gln Phe 
260 265 270 

Gly Asn Ile Pro Gly Arg Ser Phe Glu Ser Ile Leu Pro Gly Ala Asn 
275 280 285 

Pro Glu Ala Leu Glu Leu Leu Lys Lys Met Leu Glu Phe Asp Pro Lys 
290 295 300 

Lys Arg Ile Thr Val Glu Asp Ala Leu Glu His Pro Tyr Leu Ser Met 
305 310 315 320 

Trp His Asp Ile Asp Glu Glu Phe Ser Cys Gln Lys Thr Phe Arg Phe 
325 330 335 

Glu Phe Glu His Ile Glu Ser Met Ala Glu Leu Gly Asn Glu Val Ile 
340 345 350 

Lys Glu Val Phe Asp Phe Arg Lys Val Val Arg Lys His Pro Ile Ser 
355 360 365 

Gly Asp Ser Pro Ser Ser Ser Leu Ser Leu Glu Asp Ala Ile Pro Gln 
370 375 380 

Glu Val Val Gln Val His Pro Ser Arg Lys Val Leu Pro Ser Tyr Ser 
385 390 395 400 

Pro Glu Phe Ser Tyr Val Ser Gln Leu Pro Ser Leu Thr Thr Thr Gln 
405 410 415 

Pro Tyr Gln Asn Leu Met Gly Ile Ser Ser Asn Ser Phe Gln Gly Val 
420 425 430 

Asn 



<210> 15 
<211> 2978 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(1975) 
<400> 15 

tctggcttcg aggaattatt acctaaatag gaaaggcaga atatattaga aaaaaaagaa 60 
aaaccaaatg agaaaagtgc tggtgctaaa taaaacatta ttgaggggcc aagaggggac 120 
aaaagaagat ataactagat cattaagttt tcgctctagt aacaggaaca aagattgtga 180 
gatacactgt tatgctaaga gacggtgcga tattctgtac gaaaattatt taactattaa 240 
ctaaatgtat accacttcac gtgccaccga gtaggtttct aaaatgtgca accattttag 300 
gtatgtgcgc agctctttat tctaaacggg agtcactaca ttactattat cgtgtttttg 360 
cccatgtact ttcttataat cttaagacaa caacgggatg ataggcgcat tcggactttc 420 
attgatgcaa atgtgtgaaa aatgcatcca aaagacaact tttgtacaga atacaattgc 480 
aaaaatactt tacgggcata gatcggtaag gtcaccggga agctagcgta agagacctta 540 
ttcggaaccg agcaaccatt tccgaatgta gtagtagttg aaggagtaaa tcgaccttat 600 
tgtacactac ttcctttaaa tttgatttct ggccccgcgc aatttcttgg cggttaagct 660 
gtatttttac ctcatcggga aaagttattg caagttaaag gggatcaaac gattagcaaa 720 
ctaattatag atcaaaggcc gagggctttc taaatttggc atatttcgcc gtcgactgaa 780 
atagaaggga taaatcatgc atctccagga ttatccctac tccattcatt acaacatgcg 840 
ccaaatcaag cctatataag attctcgtca tttagcatgc tctattgatt tgtgtcttgt 900 
tttgtctaac actgaaactg taacctaaga tttctttaga taattattac atttacatca 960 
ataagaaatc tcataaaaca agtactgttt ataagtaaaa atg caa tat aaa aag 1015 
Met Gln Tyr Lys Lys 
1 5 



cca tta gtc gtc tcc gct tta gct gct aca tct tta gct gcc tat gct 1063 
Pro Leu Val Val Ser Ala Leu Ala Ala Thr Ser Leu Ala Ala Tyr Ala 
10 15 20 

cca aag gac ccg tgg tcc act tta act cca tca gct act tac aag ggt 1111 
Pro Lys Asp Pro Trp Ser Thr Leu Thr Pro Ser Ala Thr Tyr Lys Gly 
25 30 35 

ggt ata aca gat tac tct tcg agt ttc ggt att gct att gaa gcc gtg 1159 
Gly Ile Thr Asp Tyr Ser Ser Ser Phe Gly Ile Ala Ile Glu Ala Val 
40 45 50 

gct acc agt gct tcc tcc gtc gcc tca tct aaa gca aag aga gcc gcc 1207 
Ala Thr Ser Ala Ser Ser Val Ala Ser Ser Lys Ala Lys Arg Ala Ala 
55 60 65 

tct cag ata ggt gat ggt caa gta cag gct gcc act act act gct gct 1255 
Ser Gln Ile Gly Asp Gly Gln Val Gln Ala Ala Thr Thr Thr Ala Ala 
70 75 80 85 

gtt tct aag aaa tcc acc gct gct gct gtt tct caa ata act gac ggt 1303 
Val Ser Lys Lys Ser Thr Ala Ala Ala Val Ser Gln Ile Thr Asp Gly 
90 95 100 

caa gtt caa gct gct aag tct act gcc gct gct gtt tcc caa ata act 1351 
Gln Val Gln Ala Ala Lys Ser Thr Ala Ala Ala Val Ser Gln Ile Thr 
105 110 115 



gac ggt caa gtt caa gct gct aag tct act gcc gct gcc gtt tct caa 
Asp Gly Gln Val Gln Ala Ala Lys Ser Thr Ala Ala Ala Val Ser Gln 
120 125 130 

ata act gac ggt caa gtt caa gct gct aag tct act gcc gct gcc gtt 
Ile Thr Asp Gly Gln Val Gln Ala Ala Lys Ser Thr Ala Ala Ala Val 
135 140 145 

tct caa ata act gat ggt caa gtt caa gct gcc aag tct act gct gcc 1495 
Ser Gln Ile Thr Asp Gly Gln Val Gln Ala Ala Lys Ser Thr Ala Ala 
150 155 160 165 

gct gcc tct cag att tct gac ggc caa gtt cag gcc act acc tct act 1543 
Ala Ala Ser Gln Ile Ser Asp Gly Gln Val Gln Ala Thr Thr Ser Thr 
170 175 180 

aag gct gct gca tcc caa att aca gat ggg cag ata caa gca tct aaa 1591 
Lys Ala Ala Ala Ser Gln Ile Thr Asp Gly Gln Ile Gln Ala Ser Lys 
185 190 195 

act acc agt ggc gct agt caa gta agt gat ggc caa gtc cag gct act 1639 
Thr Thr Ser Gly Ala Ser Gln Val Ser Asp Gly Gln Val Gln Ala Thr 
200 205 210 

gct gaa gtg aaa gac gct aac gat cca gtc gat gtt gtt tcc tgt aat 1687 
Ala Glu Val Lys Asp Ala Asn Asp Pro Val Asp Val Val Ser Cys Asn 
215 220 225 

aac aat agt acc ttg tca atg agt tta agc aag ggt atc tta acc gat 1735 
Asn Asn Ser Thr Leu Ser Met Ser Leu Ser Lys Gly Ile Leu Thr Asp 
230 235 240 245 

agg aag ggt aga att ggc tct atc gtt gcc aac aga cag ttc caa ttc 1783 
Arg Lys Gly Arg Ile Gly Ser Ile Val Ala Asn Arg Gln Phe Gln Phe 
250 255 260 

gat ggt cct cca cca caa gct ggt gct atc tat gct gct ggt tgg tcc 1831 
Asp Gly Pro Pro Pro Gln Ala Gly Ala Ile Tyr Ala Ala Gly Trp Ser 
265 270 275 

atc acc cca gaa ggt aac tta gct ctt ggt gac cag gat act ttt tac 1879 
Ile Thr Pro Glu Gly Asn Leu Ala Leu Gly Asp Gln Asp Thr Phe Tyr 
280 285 290 

caa tgt ttg tct ggt gac ttc tat aac ttg tat gat aag cac att ggt 1927 
Gln Cys Leu Ser Gly Asp Phe Tyr Asn Leu Tyr Asp Lys His Ile Gly 
295 300 305 

tct cag tgc cat gaa gtt tat ttg caa gct ata gat tta att gac tgt 1975 
Ser Gln Cys His Glu Val Tyr Leu Gln Ala Ile Asp Leu Ile Asp Cys 
310 315 320 325 

tgaacgatgc atcgatcaat cggagtcgtc ctcctttaac ttcacgaatt agttgccact 2035 



ctcattcccc acacataaac ttgttttatg gcatcctttt catttagcat gtctttattt 2095 
ccaaaccttt cctcgttctt tgcattcatt tagcgtttgc tcgagaaagc atcacgtttt 2155 
cacacattat cgttcgtcgc tataataaaa atagttatag aatttactca gatttacatg 2215 
tcgtaccttt ttaattgtaa aaaaaaaaat tttatgatac ataattacct aaatataatt 2275 
cagaatcaaa catacttata gctatttgta tgctattagg tggtcctgct ataaaaatat 2335 
cgtttataat actttatatt ttatctttca acttagtcgc aattgcagaa gctttccctg 2395 
agaaaaaatt tgtgaagcta gctgcgatag caaaggagcg cttaaggtat agaaaagcac 2455 
tcagctggaa tgccaaaaga tagtttagca actgaccaag gaaaaagctt gtaggtagac 2515 
ttaacttcat tgttctctaa tcctttcgtc gtgtatattg taaaaactgc tgaacgagta 2575 
ttgataaaag atatcttggc cactaagggg cagatcccct tctggtgtga tagacaaccc 2635 
caggagcata gataacacca acttgtggtg gagggtcatc gaattggaat tgtctgttgg 2695 
caacagtaga acagatgctg cccttgctat ctgtcaaaat gccgctcttc aaagttactt 2755 
tcaaagcgct gtcactgcta cgacagcttt ttttttagaa acagcagcaa tgccttgaca 2815 
tgtaacgtaa gaaaagaaaa aagagatggc agaagaaata ctaagcgata acggcaatgt 2875 
agaggtgctt tttttatcgg aataaataga gaagtcagta acagtgattg ctgtggctcc 2935 
ctctttaatc gtatctatgt aggttccgat taaagtggtc gtg 2978 



<210> 16 
<211> 325 
<212> PRT 

<213> Saccharomyces cerevisiae 



<400> 16 

Met Gln Tyr Lys Lys Pro Leu Val Val Ser Ala Leu Ala Ala Thr Ser 
1 5 10 15 

Leu Ala Ala Tyr Ala Pro Lys Asp Pro Trp Ser Thr Leu Thr Pro Ser 
20 25 30 

Ala Thr Tyr Lys Gly Gly Ile Thr Asp Tyr Ser Ser Ser Phe Gly Ile 
35 40 45 

Ala Ile Glu Ala Val Ala Thr Ser Ala Ser Ser Val Ala Ser Ser Lys 
50 55 60 

Ala Lys Arg Ala Ala Ser Gln Ile Gly Asp Gly Gln Val Gln Ala Ala 
65 70 75 80 

Thr Thr Thr Ala Ala Val Ser Lys Lys Ser Thr Ala Ala Ala Val Ser 
85 90 95 

Gln Ile Thr Asp Gly Gln Val Gln Ala Ala Lys Ser Thr Ala Ala Ala 
100 105 110 

Val Ser Gln Ile Thr Asp Gly Gln Val Gln Ala Ala Lys Ser Thr Ala 
115 120 125 

Ala Ala Val Ser Gln Ile Thr Asp Gly Gln Val Gln Ala Ala Lys Ser 
130 135 140 

Thr Ala Ala Ala Val Ser Gln Ile Thr Asp Gly Gln Val Gln Ala Ala 
145 150 155 160 

Lys Ser Thr Ala Ala Ala Ala Ser Gln Ile Ser Asp Gly Gln Val Gln 
165 170 175 

Ala Thr Thr Ser Thr Lys Ala Ala Ala Ser Gln Ile Thr Asp Gly Gln 
180 185 190 

Ile Gln Ala Ser Lys Thr Thr Ser Gly Ala Ser Gln Val Ser Asp Gly 
195 200 205 

Gln Val Gln Ala Thr Ala Glu Val Lys Asp Ala Asn Asp Pro Val Asp 
210 215 220 

Val Val Ser Cys Asn Asn Asn Ser Thr Leu Ser Met Ser Leu Ser Lys 
225 230 235 240 

Gly Ile Leu Thr Asp Arg Lys Gly Arg Ile Gly Ser Ile Val Ala Asn 
245 250 255 

Arg Gln Phe Gln Phe Asp Gly Pro Pro Pro Gln Ala Gly Ala Ile Tyr 
260 265 270 

Ala Ala Gly Trp Ser Ile Thr Pro Glu Gly Asn Leu Ala Leu Gly Asp 
275 280 285 

Gln Asp Thr Phe Tyr Gln Cys Leu Ser Gly Asp Phe Tyr Asn Leu Tyr 
290 295 300 




Asp Lys His Ile Gly Ser Gln Cys His Glu Val Tyr Leu Gln Ala Ile 
305 310 315 320 

Asp Leu Ile Asp Cys 
325 



<210> 17 
<211> 4034 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(3031) 
<400> 17 

ccaaccacgt aagggaaaag gacggtgttt gggccattat ggcgtggttg aacatcttgg 60 
ccatttacaa caagcatcat ccggagaacg aagcttctat taagacgata cagaatgaat 120 
tctgggcaaa gtacggccgt actttcttca ctcgttatga ttttgaaaaa gttgaaacag 180 
aaaaagctaa caagattgtc gatcaattga gagcatatgt taccaaatcg ggtgttgtta 240 
attccgcctt cccagccgat gagtctctta aggtcaccga ttgtggtgat ttttcataca 300 
cagatttgga cggttctgtt tctgaccatc aaggtttata tgtcaagctt tccaatggtg 360 
caagattcgt tctaagattg tcaggtacag gttcttcagg tgctaccatt agattgtaca 420 
ttgaaaaata ctgcgatgat aaatcacaat accaaaagac agctgaagaa tacttgaagc 480 
caattattaa ctcggtcatc aagttcttga actttaaaca agttttagga actgaagaac 540 
caacggttcg tacttaaaac gaatgattta ctaatggctt aatgattttc acctttttca 600 
atgaatatta acggtaaaga agaaaatttc aattttttga acacatactt tatatactta 660 
atagatccat atttcgacat attagcaaac gattgcatag gtttctgagt cttttttttt 720 
tttttttcat aaggaggaga atattttggt taatcgcagt atcttcttca taagtgctgt 780 
ttctaattat atctaattca cgaatttttc ccaaattagc gtatccccga attcagatta 840 
cctaccccga gttttttatt atatttccct cgagaaatct gtaaaatggc cgtcatcctt 900 
agatttataa ataaaatgat aaaattcagc caaagtgctc ctaaaccaga attgttcaac 960 
tgggtcaaat tatcgcgtat acaaatatac atatagtaac atg cat tcc tgg cga 1015 
Met His Ser Trp Arg 
1 5 

ata tcc aag ttt aag tta gga agg tcc aaa gaa gat gat ggg agt agt 1063 
Ile Ser Lys Phe Lys Leu Gly Arg Ser Lys Glu Asp Asp Gly Ser Ser 
10 15 20 

gaa gat gaa aat gaa aaa tcg tgg ggt aat ggc ctg ttt cat ttc cac 1111 
Glu Asp Glu Asn Glu Lys Ser Trp Gly Asn Gly Leu Phe His Phe His 
25 30 35 

cat gga gaa aaa cat cac gat ggt agc ccg aag aat cat aat cat gaa 1159 
His Gly Glu Lys His His Asp Gly Ser Pro Lys Asn His Asn His Glu 
40 45 50 

cac gaa cac cat ata aga aag atc aat aca aat gag act ctc cca agt 1207 
His Glu His His Ile Arg Lys Ile Asn Thr Asn Glu Thr Leu Pro Ser 
55 60 65 

tcc tta agt tct cca aaa tta cgt aat gat gca tcc ttc aag aat cca 1255 
Ser Leu Ser Ser Pro Lys Leu Arg Asn Asp Ala Ser Phe Lys Asn Pro 
70 75 80 85 

tcg ggg ata gga aat gac aat tct aag gct tcc gaa agg aaa gct agt 1303 
Ser Gly Ile Gly Asn Asp Asn Ser Lys Ala Ser Glu Arg Lys Ala Ser 
90 95 100 

cag tcg tct act gag acg cag gga ccg agt tcg gaa tcc gga cta atg 1351 
Gln Ser Ser Thr Glu Thr Gln Gly Pro Ser Ser Glu Ser Gly Leu Met 
105 110 115 

aca gtg aag gtg tat tct ggt aaa gat ttt act ctt ccc ttc cct atc 1399 
Thr Val Lys Val Tyr Ser Gly Lys Asp Phe Thr Leu Pro Phe Pro Ile 
120 125 130 

acc tct aac tct act att tta caa aaa cta cta agt tcc ggc atc ctt 1447 
Thr Ser Asn Ser Thr Ile Leu Gln Lys Leu Leu Ser Ser Gly Ile Leu 
135 140 145 

act tca tca tcc aat gac gct tcc gaa gtt gca gcc ata atg cgg cag 1495 
Thr Ser Ser Ser Asn Asp Ala Ser Glu Val Ala Ala Ile Met Arg Gln 
150 155 160 165 

cta cca cga tac aag aga gtg gat caa gat tca gca ggg gaa ggc ttg 1543 
Leu Pro Arg Tyr Lys Arg Val Asp Gln Asp Ser Ala Gly Glu Gly Leu 
170 175 180 

ata gat aga gct ttt gcc act aaa ttc att cct tcc tct ata ttg tta 1591 
Ile Asp Arg Ala Phe Ala Thr Lys Phe Ile Pro Ser Ser Ile Leu Leu 
185 190 195 

cct ggg tca aca aat tca agc cca tta ctt tat ttt aca att gaa ttt 
Pro Gly Ser Thr Asn Ser Ser Pro Leu Leu Tyr Phe Thr Ile Glu Phe 
200 205 210 

gat aat tct att act act att agt cca gat atg gga acg atg gag caa 
Asp Asn Ser Ile Thr Thr Ile Ser Pro Asp Met Gly Thr Met Glu Gln 
215 220 225 

cca gtg ttt aac aaa ata tcg aca ttt gat gta aca aga aaa tta cga 
Pro Val Phe Asn Lys Ile Ser Thr Phe Asp Val Thr Arg Lys Leu Arg 
230 235 240 245 

ttt tta aaa atc gat gtc ttt gca agg att cca tcc cta ctt tta ccc 
Phe Leu Lys Ile Asp Val Phe Ala Arg Ile Pro Ser Leu Leu Leu Pro 
250 255 260 

tct aaa aac tgg caa cag gag att ggc gag cag gac gaa gta ctg aag 
Ser Lys Asn Trp Gln Gln Glu Ile Gly Glu Gln Asp Glu Val Leu Lys 
265 270 275 

gag att tta aaa aaa atc aat aca aat cag gat atc cat ttg gac tcc 
Glu Ile Leu Lys Lys Ile Asn Thr Asn Gln Asp Ile His Leu Asp Ser 
280 285 290 

ttc cat tta cct ttg aat tta aaa atc gat tct gca gcc caa ata aga 
Phe His Leu Pro Leu Asn Leu Lys Ile Asp Ser Ala Ala Gln Ile Arg 
295 300 305 

cta tac aat cac cat tgg att tct tta gaa agg gga tat ggt aaa tta 
Leu Tyr Asn His His Trp Ile Ser Leu Glu Arg Gly Tyr Gly Lys Leu 
310 315 320 325 

aat atc acg gtg gac tac aaa cct tct aag aac aag cct ctc tcc att 
Asn Ile Thr Val Asp Tyr Lys Pro Ser Lys Asn Lys Pro Leu Ser Ile 
330 335 340 

gat gac ttt gat cta ttg aag gtt atc ggg aag ggt tcg ttc ggc aaa 
Asp Asp Phe Asp Leu Leu Lys Val Ile Gly Lys Gly Ser Phe Gly Lys 
345 350 355 

gtg atg caa gta agg aaa aaa gat acc caa aag att tac gct ttg aag 2119 
Val Met Gln Val Arg Lys Lys Asp Thr Gln Lys Ile Tyr Ala Leu Lys 
360 365 370 

gct ctg aga aaa gca tat att gta tcg aaa tgt gaa gtg aca cat act 2167 
Ala Leu Arg Lys Ala Tyr Ile Val Ser Lys Cys Glu Val Thr His Thr 
375 380 385 

tta gcg gag agg act gtc cta gca aga gtt gac tgc ccc ttt att gtt 2215 
Leu Ala Glu Arg Thr Val Leu Ala Arg Val Asp Cys Pro Phe Ile Val 
390 395 400 405 

ccg ttg aag ttc tca ttc caa tct ccg gag aag ttg tac cta gta tta 2263 
Pro Leu Lys Phe Ser Phe Gln Ser Pro Glu Lys Leu Tyr Leu Val Leu 
410 415 420 

gct ttc att aat ggc ggt gaa ctg ttc tac cat tta caa cac gag gga 2311 
Ala Phe Ile Asn Gly Gly Glu Leu Phe Tyr His Leu Gln His Glu Gly 
425 430 435 

cga ttc agt cta gca cgc tcc cgt ttt tat att gca gaa cta tta tgt 2359 
Arg Phe Ser Leu Ala Arg Ser Arg Phe Tyr Ile Ala Glu Leu Leu Cys 
440 445 450 

gct ctc gat tca tta cac aaa ctt gac gtc att tat cgt gac cta aag 2407 
Ala Leu Asp Ser Leu His Lys Leu Asp Val Ile Tyr Arg Asp Leu Lys 
455 460 465 

cct gaa aac att cta ttg gat tac caa gga cat att gca ctg tgt gat 2455 
Pro Glu Asn Ile Leu Leu Asp Tyr Gln Gly His Ile Ala Leu Cys Asp 
470 475 480 485 

ttt ggg ctt tgc aag ctg aac atg aag gat aat gac aaa aca gac act 2503 
Phe Gly Leu Cys Lys Leu Asn Met Lys Asp Asn Asp Lys Thr Asp Thr 
490 495 500 

ttc tgt ggt act ccc gaa tat ttg gca cca gaa atc ttg ttg ggg cag 2551 
Phe Cys Gly Thr Pro Glu Tyr Leu Ala Pro Glu Ile Leu Leu Gly Gln 
505 510 515 

ggc tat act aaa aca gtt gac tgg tgg aca tta ggt atc tta ctg tat 2599 
Gly Tyr Thr Lys Thr Val Asp Trp Trp Thr Leu Gly Ile Leu Leu Tyr 
520 525 530 

gag atg atg aca ggg ctg cca cca tac tat gat gag aac gtt cct gtt 2647 
Glu Met Met Thr Gly Leu Pro Pro Tyr Tyr Asp Glu Asn Val Pro Val 
535 540 545 

atg tac aag aaa att ctg cag caa ccg cta cta ttt cct gat gga ttt 2695 
Met Tyr Lys Lys Ile Leu Gln Gln Pro Leu Leu Phe Pro Asp Gly Phe 
550 555 560 565 

gac cct gcg gca aaa gac cta tta att ggc ctc tta agc aga gac cca 2743 
Asp Pro Ala Ala Lys Asp Leu Leu Ile Gly Leu Leu Ser Arg Asp Pro 
570 575 580 

agc aga aga ctc ggc gtt aac ggt aca gat gaa att cgt aac cat cct 2791 
Ser Arg Arg Leu Gly Val Asn Gly Thr Asp Glu Ile Arg Asn His Pro 
585 590 595 

ttc ttt aaa gac atc tca tgg aaa aag cta ctt ttg aag ggc tat att 2839 
Phe Phe Lys Asp Ile Ser Trp Lys Lys Leu Leu Leu Lys Gly Tyr Ile 
600 605 610 

ccg cct tac aag cca att gta aag agt gaa ata gat act gca aat ttt 2887 
Pro Pro Tyr Lys Pro Ile Val Lys Ser Glu Ile Asp Thr Ala Asn Phe 
615 620 625 

gat caa gag ttc act aag gaa aaa ccg atc gat agt gta gtg gac gag 2935 
Asp Gln Glu Phe Thr Lys Glu Lys Pro Ile Asp Ser Val Val Asp Glu 
630 635 640 645 

tac tta agt gca agt att caa aag cag ttt ggt ggg tgg acg tac att 2983 
Tyr Leu Ser Ala Ser Ile Gln Lys Gln Phe Gly Gly Trp Thr Tyr Ile 
650 655 660 

ggt gac gaa cag ttg ggt gat tct cct tcg cag ggg aga agc att agt 3031 
Gly Asp Glu Gln Leu Gly Asp Ser Pro Ser Gln Gly Arg Ser Ile Ser 
665 670 675 

tagaagcaag ccgaagcaag ccgagccgag ccggacggaa tttatagcta tagccgcaag 3091 

aggttgcaat tttcaaaaat ggatagttca agtagattgc gatacgcact ccgttactat 3151 

tgtggttaac ggggacaaga agaactacag aaaatagaat ggtccgcaga ggctgcgctc 3211 

ttcttttagc aactctcaca cgacttatgt tgcttattca tttcttttac agcattatca 3271 

gaattcttcc atctacggaa ttgagatcaa agaccgacct gttgtcggcc gaaggacgga 3331 

cgcttatacc cgcggatgtc aaagcgaagc ccgcggggcg caagtcgagg ttaccggaat 3391 

tcgccaaacg gcaaaggacc cttgcattgc ctgaaaggaa agattcgctt ttctgtttgt 3451 

tgccactttt cttacatagt ctgggccggg agcagcttat ttcttccgcg gatgatcctg 3511 

gatttccttg cgcgggctca gccatgggga gccttaccta gtcccgtaaa gggaaaaagc 3571 

taacctcatt cgcctcacag ggtgaaagcg tgaacaaaaa aaaaagaaaa gcttaatgat 3631 
taaaatttac agtatatata tatttgtatt tacgtattaa actatatata aatagatatg 3691 
tatgccgaaa aagtaaagtc tgggtgatgc ctagtccaat ctttcttact actgtccagt 3751 
ttctatcgta gcagttaatt atacatagaa ctgtgtaaat tcaacgcatt aatttttttt 3811 
tttttcactt tcgcagttag gggggacaca ttttttttgc cctttcttaa gcttcgtaag 3871 
cgagttacat cattatttct tcctgggata caatacgcgt tcgtacaagt cacagctgga 3931 
ccgtataggg aacaagactg caactctctc caacttgtta aacagaggag gaaaagaaag 3991 
agggaaaaga ggaacaaaga caatcaaaga aaaagaatag aaa 4034 



<210> 18 
<211> 677 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 18 

Met His Ser Trp Arg Ile Ser Lys Phe Lys Leu Gly Arg Ser Lys Glu 
1 5 10 15 

Asp Asp Gly Ser Ser Glu Asp Glu Asn Glu Lys Ser Trp Gly Asn Gly 
20 25 30 

Leu Phe His Phe His His Gly Glu Lys His His Asp Gly Ser Pro Lys 
35 40 45 

Asn His Asn His Glu His Glu His His Ile Arg Lys Ile Asn Thr Asn 
50 55 60 

Glu Thr Leu Pro Ser Ser Leu Ser Ser Pro Lys Leu Arg Asn Asp Ala 
65 70 75 80 

Ser Phe Lys Asn Pro Ser Gly Ile Gly Asn Asp Asn Ser Lys Ala Ser 
85 90 95 

Glu Arg Lys Ala Ser Gln Ser Ser Thr Glu Thr Gln Gly Pro Ser Ser 
100 105 110 

Glu Ser Gly Leu Met Thr Val Lys Val Tyr Ser Gly Lys Asp Phe Thr 
115 120 125 

Leu Pro Phe Pro Ile Thr Ser Asn Ser Thr Ile Leu Gln Lys Leu Leu 
130 135 140 




Ser Ser Gly Ile Leu Thr Ser Ser Ser Asn Asp Ala Ser Glu Val Ala 
145 150 155 160 

Ala Ile Met Arg Gln Leu Pro Arg Tyr Lys Arg Val Asp Gln Asp Ser 
165 170 175 

Ala Gly Glu Gly Leu Ile Asp Arg Ala Phe Ala Thr Lys Phe Ile Pro 
180 185 190 

Ser Ser Ile Leu Leu Pro Gly Ser Thr Asn Ser Ser Pro Leu Leu Tyr 
195 200 205 

Phe Thr Ile Glu Phe Asp Asn Ser Ile Thr Thr Ile Ser Pro Asp Met 
210 215 220 

Gly Thr Met Glu Gln Pro Val Phe Asn Lys Ile Ser Thr Phe Asp Val 
225 230 235 240 

Thr Arg Lys Leu Arg Phe Leu Lys Ile Asp Val Phe Ala Arg Ile Pro 
245 250 255 

Ser Leu Leu Leu Pro Ser Lys Asn Trp Gln Gln Glu Ile Gly Glu Gln 
260 265 270 

Asp Glu Val Leu Lys Glu Ile Leu Lys Lys Ile Asn Thr Asn Gln Asp 
275 280 285 

Ile His Leu Asp Ser Phe His Leu Pro Leu Asn Leu Lys Ile Asp Ser 
290 295 300 

Ala Ala Gln Ile Arg Leu Tyr Asn His His Trp Ile Ser Leu Glu Arg 
305 310 315 320 

Gly Tyr Gly Lys Leu Asn Ile Thr Val Asp Tyr Lys Pro Ser Lys Asn 
325 330 335 

Lys Pro Leu Ser Ile Asp Asp Phe Asp Leu Leu Lys Val Ile Gly Lys 
340 345 350 

Gly Ser Phe Gly Lys Val Met Gln Val Arg Lys Lys Asp Thr Gln Lys 
355 360 365 

Ile Tyr Ala Leu Lys Ala Leu Arg Lys Ala Tyr Ile Val Ser Lys Cys 
370 375 380 

Glu Val Thr His Thr Leu Ala Glu Arg Thr Val Leu Ala Arg Val Asp 
385 390 395 400 

Cys Pro Phe Ile Val Pro Leu Lys Phe Ser Phe Gln Ser Pro Glu Lys 
405 410 415 

Leu Tyr Leu Val Leu Ala Phe Ile Asn Gly Gly Glu Leu Phe Tyr His 
420 425 430 

Leu Gln His Glu Gly Arg Phe Ser Leu Ala Arg Ser Arg Phe Tyr Ile 
435 440 445 

Ala Glu Leu Leu Cys Ala Leu Asp Ser Leu His Lys Leu Asp Val Ile 
450 455 460 

Tyr Arg Asp Leu Lys Pro Glu Asn Ile Leu Leu Asp Tyr Gln Gly His 
465 470 475 480 

Ile Ala Leu Cys Asp Phe Gly Leu Cys Lys Leu Asn Met Lys Asp Asn 
485 490 495 

Asp Lys Thr Asp Thr Phe Cys Gly Thr Pro Glu Tyr Leu Ala Pro Glu 
500 505 510 

Ile Leu Leu Gly Gln Gly Tyr Thr Lys Thr Val Asp Trp Trp Thr Leu 
515 520 525 

Gly Ile Leu Leu Tyr Glu Met Met Thr Gly Leu Pro Pro Tyr Tyr Asp 
530 535 540 

Glu Asn Val Pro Val Met Tyr Lys Lys Ile Leu Gln Gln Pro Leu Leu 
545 550 555 560 

Phe Pro Asp Gly Phe Asp Pro Ala Ala Lys Asp Leu Leu Ile Gly Leu 
565 570 575 

Leu Ser Arg Asp Pro Ser Arg Arg Leu Gly Val Asn Gly Thr Asp Glu 
580 585 590 

Ile Arg Asn His Pro Phe Phe Lys Asp Ile Ser Trp Lys Lys Leu Leu 
595 600 605 

Leu Lys Gly Tyr Ile Pro Pro Tyr Lys Pro Ile Val Lys Ser Glu Ile 
610 615 620 

Asp Thr Ala Asn Phe Asp Gln Glu Phe Thr Lys Glu Lys Pro Ile Asp 
625 630 635 640 

Ser Val Val Asp Glu Tyr Leu Ser Ala Ser Ile Gln Lys Gln Phe Gly 
645 650 655 

Gly Trp Thr Tyr Ile Gly Asp Glu Gln Leu Gly Asp Ser Pro Ser Gln 
660 665 670 

Gly Arg Ser Ile Ser 
675 



<210> 19 
<211> 2765 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(1762)
<400> 19 

ggatatgatt gctgagaatg cgttaccggc caaaacaaag acagcgggat tgagaaaatt 60 
aaagaaggaa gatattgacc aagtttttga gttgttcaaa agatatcaat ccaggttcga 120 
actaattcaa attttcacaa aagaagaatt cgaacataat ttcattggtg aagaatcgtt 180
accattggat aaacaagtaa ttttctcata tgtagtcgaa cagcccgatg gaaaaattac 240
agacttcttc tcattttact cattgccatt cacaatccta aataacacaa aatataagga 300
cctaggcatc gggtacttgt attattatgc caccgatgca gatttccaat tcaaagacag 360 
gtttgatcca aaagctacta aggctttgaa aacaagattg tgtgaattga tttatgacgc 420 
ttgtattttg gccaaaaacg ctaatatgga tgtttttaac gcgttgactt cgcaagataa 480
tacattgttc ttggatgatt tgaagttcgg gcccggtgac gggttcttga acttctattt 540
atttaattat agagcaaagc cgattaccgg tggcttgaat cccgacaata gtaacgacat 600 
taaaaggcgt agcaatgtcg gtgttgttat gttgtagtgg ctgaaaggac gaggcgtata 660 
tagttttcgt gtacatagcc gacagaattt gaccacattt agtttttccg catagtcaat 720 
tgacgaagtg aaaaaataat taatccaatg gctggcttta gagtgtcagc ctccaaaata 780
aatccaaaaa tagacaaaga gaatcactat aattaccgcc ttggagtcca agttggcttg 840 
agaactcgca tttattttta gcgactgagg tagctgaaaa acgcctactt tctcagaagg 900 
cggtagtgag catatataag tatgtaagaa agatcaactc ttctggacta gatactcacc 960

gatctagtga aaatataaac aaacccaaca tatatataaa atg aag gcc tgt tcc 1015

Met Lys Ala Cys Ser
1 5

ata tta ttt acc acc tta att act cta gcc gct gct caa aaa gac tct 1063
Ile Leu Phe Thr Thr Leu Ile Thr Leu Ala Ala Ala Gln Lys Asp Ser
10 15 20

ggt tcc tta gat ggc cag aac tct gaa gat agc tca caa aag gaa agc 1111
Gly Ser Leu Asp Gly Gln Asn Ser Glu Asp Ser Ser Gln Lys Glu Ser
25 30 35

tca aac tct caa gag atc aca cct acc acg aca aag gaa gcc caa gaa 1159
Ser Asn Ser Gln Glu Ile Thr Pro Thr Thr Thr Lys Glu Ala Gln Glu
40 45 50

agc gca tca act gta gtt tct acc gga aaa agc tta gta caa act agc 1207
Ser Ala Ser Thr Val Val Ser Thr Gly Lys Ser Leu Val Gln Thr Ser
55 60 65

aac gtc gtc agc aac acc tat gct gtg gct cca agt acc acc gta gtg 1255
Asn Val Val Ser Asn Thr Tyr Ala Val Ala Pro Ser Thr Thr Val Val
70 75 80 85

acg acg gat gca caa ggc aaa acc acg aca cag tac cta tgg tgg gtg 1303
Thr Thr Asp Ala Gln Gly Lys Thr Thr Thr Gln Tyr Leu Trp Trp Val
90 95 100

gcc gaa agc aac tct gcc gta agc aca act tca act gcc tct gtg cag 1351
Ala Glu Ser Asn Ser Ala Val Ser Thr Thr Ser Thr Ala Ser Val Gln
105 110 115

ccc acc gga gag acg tca agc gga atc acc aac tcc gca tcc tcc tca 1399
Pro Thr Gly Glu Thr Ser Ser Gly Ile Thr Asn Ser Ala Ser Ser Ser
120 125 130

acg aca tca aca tca acg gac ggg cca gtt act ata gta act acc acg 1447
Thr Thr Ser Thr Ser Thr Asp Gly Pro Val Thr Ile Val Thr Thr Thr
135 140 145

aat tcg tta ggt gag act tac aca tct act gtt tgg tgg cta ccg tcc 1495
Asn Ser Leu Gly Glu Thr Tyr Thr Ser Thr Val Trp Trp Leu Pro Ser
150 155 160 165

tca gcc aca act gac aac acg gct tca tca agt aaa tca tct tcg gga 1543
Ser Ala Thr Thr Asp Asn Thr Ala Ser Ser Ser Lys Ser Ser Ser Gly
170 175 180

tcc tca tca aaa ccg gaa tca agc acc aag gta gta agc act atc aaa 1591
Ser Ser Ser Lys Pro Glu Ser Ser Thr Lys Val Val Ser Thr Ile Lys
185 190 195

tca act tat acc act acg tca ggt tct aca gta gag aca ctg acc act 1639
Ser Thr Tyr Thr Thr Thr Ser Gly Ser Thr Val Glu Thr Leu Thr Thr
200 205 210

aca tac aag tct aca gtc aac ggt aag gta gcg tcc gta atg tcc aat 1687
Thr Tyr Lys Ser Thr Val Asn Gly Lys Val Ala Ser Val Met Ser Asn
215 220 225

tct acc aat ggc gcc ttt gcc ggc act cac ata gct tat ggt gcg ggt 1735
Ser Thr Asn Gly Ala Phe Ala Gly Thr His Ile Ala Tyr Gly Ala Gly
230 235 240 245

gca ttc gcc gtt ggt gcc ctt ttg tta tagaatgtat aatcagttct 1782
Ala Phe Ala Val Gly Ala Leu Leu Leu
250

gtataccacc acatagttct gcattttaat aaaactcttt ctttttatac actgtaggta 1842 
accaataata taactattgt tatcatcgtg cttgcgtatt ttttttcttt cgggtgaaaa 1902
actccgcagt atttctcgct ctccctggat aataagctag aaaaaaaaaa tatatatgac 1962
agatggatga gtaatcatat tcaataagta ttgtctggct tctgagacgg cggtaagata 2022
tccttaagag ttgcaatggt ccttttacac aaaagcacac atatatttcc taccgatttt 2082 
gcctctgttt cacgcgcctt ttttaataga taccccaatc catactcccc ccatgtacta 2142 
tccatagaca caatatcaag gaacgttgat caagaaggaa atttgcgcac aacgaggctg 2202
ttgaaaaagt ccggaaagct gcccacatgg gtcaaaccct ttttaagagg tataacagaa 2262
acatggataa tcgaagtttc cgtagtgaac cccgctaact ccacaatgaa aacttacact 2322 
aggaatctgg atcacactgg aatcatgaag gttgaagaat atactaccta tcaatttgac 2382
agtgctacaa gtagtacgat agcagacagc cgggtgaagt tttcaagtgg cttcaatatg 2442 
ggtatcaaat ctaaggtaga ggattggtcg cgcactaaat ttgacgaaaa cgttaagaaa 2502 
agcagaatgg gcatggcatt tgttatccaa aaactcgaag aggcgagaaa tcctcagttt 2562
tgatgttccc atttaaagat ctttaaagat atcaccatgg gcgagcgaaa ttgagaaaac 2622
tagtgcagct cgcatttggt cacgtcctaa aaattgtaaa taagcgctgg ttcaacaaaa 2682 
tttaatatac acacatatat aattatttat ttataacagt cattctgcta aactatacat 2742 
caaatgtcac taatcttgat att 2765 



<210> 20 
<211> 254 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 20 

Met Lys Ala Cys Ser Ile Leu Phe Thr Thr Leu Ile Thr Leu Ala Ala
1 5 10 15

Ala Gln Lys Asp Ser Gly Ser Leu Asp Gly Gln Asn Ser Glu Asp Ser
20 25 30

Ser Gln Lys Glu Ser Ser Asn Ser Gln Glu Ile Thr Pro Thr Thr Thr
35 40 45

Lys Glu Ala Gln Glu Ser Ala Ser Thr Val Val Ser Thr Gly Lys Ser
50 55 60

Leu Val Gln Thr Ser Asn Val Val Ser Asn Thr Tyr Ala Val Ala Pro
65 70 75 80

Ser Thr Thr Val Val Thr Thr Asp Ala Gln Gly Lys Thr Thr Thr Gln
85 90 95

Tyr Leu Trp Trp Val Ala Glu Ser Asn Ser Ala Val Ser Thr Thr Ser
100 105 110

Thr Ala Ser Val Gln Pro Thr Gly Glu Thr Ser Ser Gly Ile Thr Asn
115 120 125

Ser Ala Ser Ser Ser Thr Thr Ser Thr Ser Thr Asp Gly Pro Val Thr
130 135 140

Ile Val Thr Thr Thr Asn Ser Leu Gly Glu Thr Tyr Thr Ser Thr Val
145 150 155 160

Trp Trp Leu Pro Ser Ser Ala Thr Thr Asp Asn Thr Ala Ser Ser Ser
165 170 175

Lys Ser Ser Ser Gly Ser Ser Ser Lys Pro Glu Ser Ser Thr Lys Val
180 185 190

Val Ser Thr Ile Lys Ser Thr Tyr Thr Thr Thr Ser Gly Ser Thr Val
195 200 205

Glu Thr Leu Thr Thr Thr Tyr Lys Ser Thr Val Asn Gly Lys Val Ala
210 215 220

Ser Val Met Ser Asn Ser Thr Asn Gly Ala Phe Ala Gly Thr His Ile
225 230 235 240

Ala Tyr Gly Ala Gly Ala Phe Ala Val Gly Ala Leu Leu Leu 
245 250 



<210> 21 
<211> 3335 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(2332)
<400> 21 

tcttgttctt tacagattca agaggaaacc aaaaaaaaat caaagaaaaa gaatcgaatt 60 
tttcccaaaa tgaaagtgta aggaaaaaaa aagaggagat agaaaatccg aagaacccca 120
agggacggac aaacacaaga cgatgctgca cgtggttagt tttgtaagcg caggttacga 180
taaagagcat aaacaaatca ttactaagag cggtatacaa gaataaagtg acaaacagtt 240 
ctccctattt aacgcactta acgtaggttc catcattatg atgctattgc cacatcaaat 300
ctcctttgga ctgaacccgc attagtaatt gcccgctttt cttttcttcc gcgggtgggc 360 
cccataaata gaaaaaaaaa gaaagaaagc gtttaaataa atagagtgag cggatttcta 420 
ttatctgaaa accgggttat aatgcacgtg atatgcacgt gggagctggg cggctatttt 480
tttctttttc aaatgtattt gagtcgttta aaatagcact ccccgttgac ccgccaactc 540 
atttttgttt tctctttacg gaaaaggctt taaattaagg cccgcatttt cggtatcctt 600 
gagggaaaaa aaccaaagaa acccaaaaaa gaccacaaag ctgggatatc ttaattagta 660 




gagagggctt ttagttttaa tagtgttacg agtctctaaa aatagcgtag gcacactgcc 720
ctgattcgga ctttgatcag agtttattac tacaaagagt aatgttgaat gattgggctg 780
ggttttcata gcattaactc taagtaatat cattcaaccg ctcaaggttc cttacgagca 840
aacccatata tgctctacag ataaacatat aaatagcgtg catattcttc tctattcaac 900
tcttgctctg tatagttcaa tagaatctta cagtacatca cgctgcaata gatctaatcc 960

aagagagaag caaaaaaaaa aagctcgcta taaaaatatc atg caa tta cat tca 1015

Met Gln Leu His Ser
1 5

ctt atc gct tca act gcg ctc tta ata acg tca gct ttg gct gct act 1063
Leu Ile Ala Ser Thr Ala Leu Leu Ile Thr Ser Ala Leu Ala Ala Thr
10 15 20

tcc tct tct tcc agc ata ccc tct tcc tgt acc ata agc tca cat gcc 1111
Ser Ser Ser Ser Ser Ile Pro Ser Ser Cys Thr Ile Ser Ser His Ala
25 30 35

acg gcc aca gct cag agt gac tta gat aaa tat agc cgc tgt gat acg 1159
Thr Ala Thr Ala Gln Ser Asp Leu Asp Lys Tyr Ser Arg Cys Asp Thr
40 45 50

tta gtc ggg aac tta act att ggt ggt ggt ttg aag act ggt gct ttg 1207
Leu Val Gly Asn Leu Thr Ile Gly Gly Gly Leu Lys Thr Gly Ala Leu
55 60 65

gct aat gtt aaa gaa atc aac ggg tct cta act ata ttt aac gct aca 1255
Ala Asn Val Lys Glu Ile Asn Gly Ser Leu Thr Ile Phe Asn Ala Thr
70 75 80 85

aat cta acc tca ttc gct gct gat tcc ttg gag tcc atc aca gat tct 1303
Asn Leu Thr Ser Phe Ala Ala Asp Ser Leu Glu Ser Ile Thr Asp Ser
90 95 100

ttg aac cta cag agt ttg aca atc ttg act tct gct tca ttt ggg tct 1351
Leu Asn Leu Gln Ser Leu Thr Ile Leu Thr Ser Ala Ser Phe Gly Ser
105 110 115

tta cag agc gtt gat agt ata aaa ctg att act cta ccc gcc atc tcc 1399
Leu Gln Ser Val Asp Ser Ile Lys Leu Ile Thr Leu Pro Ala Ile Ser
120 125 130



agt ttt act tca aat atc aaa tct gct aac aac att tat att tcc gac 1447
Ser Phe Thr Ser Asn Ile Lys Ser Ala Asn Asn Ile Tyr Ile Ser Asp
135 140 145

act tcg tta caa tct gtc gat gga ttc tca gcc ttg aaa aaa gtt aac 1495
Thr Ser Leu Gln Ser Val Asp Gly Phe Ser Ala Leu Lys Lys Val Asn
150 155 160 165

gtg ttc aac gtc aat aac aat aag aaa tta acc tcg atc aaa tct cca 1543
Val Phe Asn Val Asn Asn Asn Lys Lys Leu Thr Ser Ile Lys Ser Pro
170 175 180

gtt gaa aca gtc agc gat tct tta caa ttt tcg ttc aac ggt aac cag 1591
Val Glu Thr Val Ser Asp Ser Leu Gln Phe Ser Phe Asn Gly Asn Gln
185 190 195

act aaa atc acc ttc gat gac ttg gtt tgg gca aac aat atc agt ttg 1639
Thr Lys Ile Thr Phe Asp Asp Leu Val Trp Ala Asn Asn Ile Ser Leu
200 205 210

acc gat gtc cac tct gtt tcc ttc gct aac ttg caa aag att aac tct 1687
Thr Asp Val His Ser Val Ser Phe Ala Asn Leu Gln Lys Ile Asn Ser
215 220 225

tca ttg ggt ttc atc aac aac tcc atc tca agt ttg aat ttc act aag 1735
Ser Leu Gly Phe Ile Asn Asn Ser Ile Ser Ser Leu Asn Phe Thr Lys
230 235 240 245

cta aac acc att ggc caa acc ttc agt atc gtt tcc aat gac tac ttg 1783
Leu Asn Thr Ile Gly Gln Thr Phe Ser Ile Val Ser Asn Asp Tyr Leu
250 255 260

aag aac ttg tcg ttc tct aat ttg tca acc ata ggt ggt gct ctt gtc 1831
Lys Asn Leu Ser Phe Ser Asn Leu Ser Thr Ile Gly Gly Ala Leu Val
265 270 275

gtt gct aac aac act ggt tta caa aaa att ggt ggt ctc gac aac cta 1879
Val Ala Asn Asn Thr Gly Leu Gln Lys Ile Gly Gly Leu Asp Asn Leu
280 285 290

aca acc att ggc ggt act ttg gaa gtt gtt ggt aac ttc acc tcc ttg 1927
Thr Thr Ile Gly Gly Thr Leu Glu Val Val Gly Asn Phe Thr Ser Leu
295 300 305

aac cta gac tct ttg aag tct gtc aag ggt ggc gca gat gtc gaa tca 1975
Asn Leu Asp Ser Leu Lys Ser Val Lys Gly Gly Ala Asp Val Glu Ser
310 315 320 325

aag tca agc aat ttc tcc tgt aat gct ttg aaa gct ttg caa aag aaa 2023
Lys Ser Ser Asn Phe Ser Cys Asn Ala Leu Lys Ala Leu Gln Lys Lys
330 335 340



ggg ggt atc aag ggt gaa tct ttt gtc tgc aaa aat ggt gca tca tcc 2071
Gly Gly Ile Lys Gly Glu Ser Phe Val Cys Lys Asn Gly Ala Ser Ser
345 350 355

aca tct gtt aaa cta tcg tcc act tcc aaa tct caa tca agc caa act 2119
Thr Ser Val Lys Leu Ser Ser Thr Ser Lys Ser Gln Ser Ser Gln Thr
360 365 370

act gcc aag gtt tcc aag tca tct tct aag gcc gag gaa aag aag ttc 2167
Thr Ala Lys Val Ser Lys Ser Ser Ser Lys Ala Glu Glu Lys Lys Phe
375 380 385

act tct ggc gat atc aag gct gct gct tct gcc tct agt gtt tct agt 2215
Thr Ser Gly Asp Ile Lys Ala Ala Ala Ser Ala Ser Ser Val Ser Ser
390 395 400 405

tct ggc gct tcc agc tct agc tct aag agt tcc aaa ggc aat gcc gct 2263
Ser Gly Ala Ser Ser Ser Ser Ser Lys Ser Ser Lys Gly Asn Ala Ala
410 415 420

atc atg gca cca att ggc caa aca acc cct ttg gtc ggt ctt ttg acg 2311
Ile Met Ala Pro Ile Gly Gln Thr Thr Pro Leu Val Gly Leu Leu Thr
425 430 435

gca atc atc atg tct ata atg taatggaatg aagaaatatt cttcattttt 2362
Ala Ile Ile Met Ser Ile Met
440



gataactagt acctgtcatt cacgacatgt gaacaaataa aaacatttat ttaaaaattt 2422 
tatgtattca aatattttcg ggaaagagat aaaagtaacg acacttaaaa atttaaaaaa 2482
tcacaatact ttatttactc agtcttttga tcagctccgg cacctccttg ttgttgcttc 2542 
tttgctgagc ccgcaacaaa attgtaaatc aataggccta aaagtaacat tttccagttc 2602
ttttgaaacc aagacacctc cttaacctct tcatcttctt cgaattgtgc agtgcttcca 2662 
tccttattct tactagcttt tttatcagca taagttttgg ttttcttctt taatttggtg 2722 
actggtgcag taggtcccgc ttctggatat ctgactgtag cagtaatagc atcgttagtc 2782 
tcgtcatatg ataacgacac ttgtttgacc tcattatctt catctacatc cacaattaaa 2842
tcgtatttta gtggtgtcct cagcttcatg tagctaaaac atggcatatc cagcttacct 2902



tcaatctggg cattcaaaca gtattctcca gaaacttcaa catcctgtat attaacggtt 2962 
gttactgtaa cattcccatc ggatgtacta tcaatctcaa atgttcctag gggtatagcg 3022 
tctttcgcat catctgaata gcttaattgt aaaatatcag cacagaaaac catgctggcc 3082 
aataaaatca cacgcaacag ccgcacaagc atctttcctt caatgagtat tgtacagttc 3142 
ttgttagata gtgttgaata gtaccacctt gtttttttac tcaaagtgtc ttttatatac 3202 
ttctaattat tcctatattt ggttgggttt ttaagttacc aatgcaaata cagtggttag 3262 
agacccagcg cacgtatgca aaaaaataca gcggaaattt caagtaaaaa tgtagcttca 3322 

taaaaaagaa gca 3335



<210> 22 
<211> 444 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 22 

Met Gln Leu His Ser Leu Ile Ala Ser Thr Ala Leu Leu Ile Thr Ser
1 5 10 15

Ala Leu Ala Ala Thr Ser Ser Ser Ser Ser Ile Pro Ser Ser Cys Thr
20 25 30

Ile Ser Ser His Ala Thr Ala Thr Ala Gln Ser Asp Leu Asp Lys Tyr
35 40 45

Ser Arg Cys Asp Thr Leu Val Gly Asn Leu Thr Ile Gly Gly Gly Leu
50 55 60

Lys Thr Gly Ala Leu Ala Asn Val Lys Glu Ile Asn Gly Ser Leu Thr
65 70 75 80

Ile Phe Asn Ala Thr Asn Leu Thr Ser Phe Ala Ala Asp Ser Leu Glu
85 90 95

Ser Ile Thr Asp Ser Leu Asn Leu Gln Ser Leu Thr Ile Leu Thr Ser
100 105 110

Ala Ser Phe Gly Ser Leu Gln Ser Val Asp Ser Ile Lys Leu Ile Thr
115 120 125

Leu Pro Ala Ile Ser Ser Phe Thr Ser Asn Ile Lys Ser Ala Asn Asn
130 135 140

Ile Tyr Ile Ser Asp Thr Ser Leu Gln Ser Val Asp Gly Phe Ser Ala
145 150 155 160

Leu Lys Lys Val Asn Val Phe Asn Val Asn Asn Asn Lys Lys Leu Thr
165 170 175

Ser Ile Lys Ser Pro Val Glu Thr Val Ser Asp Ser Leu Gln Phe Ser
180 185 190

Phe Asn Gly Asn Gln Thr Lys Ile Thr Phe Asp Asp Leu Val Trp Ala
195 200 205

Asn Asn Ile Ser Leu Thr Asp Val His Ser Val Ser Phe Ala Asn Leu
210 215 220

Gln Lys Ile Asn Ser Ser Leu Gly Phe Ile Asn Asn Ser Ile Ser Ser
225 230 235 240

Leu Asn Phe Thr Lys Leu Asn Thr Ile Gly Gln Thr Phe Ser Ile Val
245 250 255

Ser Asn Asp Tyr Leu Lys Asn Leu Ser Phe Ser Asn Leu Ser Thr Ile
260 265 270

Gly Gly Ala Leu Val Val Ala Asn Asn Thr Gly Leu Gln Lys Ile Gly
275 280 285

Gly Leu Asp Asn Leu Thr Thr Ile Gly Gly Thr Leu Glu Val Val Gly
290 295 300

Asn Phe Thr Ser Leu Asn Leu Asp Ser Leu Lys Ser Val Lys Gly Gly
305 310 315 320

Ala Asp Val Glu Ser Lys Ser Ser Asn Phe Ser Cys Asn Ala Leu Lys
325 330 335

Ala Leu Gln Lys Lys Gly Gly Ile Lys Gly Glu Ser Phe Val Cys Lys
340 345 350

Asn Gly Ala Ser Ser Thr Ser Val Lys Leu Ser Ser Thr Ser Lys Ser
355 360 365

Gln Ser Ser Gln Thr Thr Ala Lys Val Ser Lys Ser Ser Ser Lys Ala
370 375 380

Glu Glu Lys Lys Phe Thr Ser Gly Asp Ile Lys Ala Ala Ala Ser Ala
385 390 395 400

Ser Ser Val Ser Ser Ser Gly Ala Ser Ser Ser Ser Ser Lys Ser Ser
405 410 415

Lys Gly Asn Ala Ala Ile Met Ala Pro Ile Gly Gln Thr Thr Pro Leu
420 425 430

Val Gly Leu Leu Thr Ala Ile Ile Met Ser Ile Met
435 440

<210> 23 
<211> 3107 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(2104)
<400> 23 

ttgggattcc attttttata aggcgataat attaggtatg tagatatact agaagttctc 60 
ctcgaggatt taggaatcca taaaagggaa tctgcaattc tacacaattc tataaatatt 120 
attatcatca ttttatatgt taatattcat tgatcctatt acattatcaa tccttgcgtt 180 
tcagcttcca ctaatttaga tgactatttt tcatcatttg cgtcatcttc taacaccgta 240 
tatgataata tactagtaac gtaaatacta gttagtagat gatagttgat ttttattcca 300 
acagtattta tgttttgtca ttcttttcta cataatcttg aaactaggta gatctacaat 360 
tgaaaagtaa atactaacat tatttactaa atttaagtta gaaatcggca cgaaaaaaat 420 
ttgacagatt acgagagtcc agccaaaata tgagtatatt actatttccc cttggtgaaa 480 
gaaatgaaag atgttatttt ttaccggctt agtaatactg agctacttac ttgggggaaa 540 
gaaagattgg ctacttatta tgtatgaagc ctcagattac cttgaattcc tcaaccgttt 600 
gagcagtatg ctcttcaaat tcgaactttt tgaacatctt tcctccacat tcctgatttt 660 
ttcacattca aaacgcgctg tgaagctgtt agaaatttac agatcgaggc atatttctat 720 
atataatgta tttttattaa gacacccaaa gtacttccaa tctgtagata ttgcacttta 780 
tctgaccaga agccagactt gaacagttac atattgtgct ttgcagtcgt taaatttccc 840
gaactgtttt cgtatttttt tttcttttcc tcttttccac tggatcagat caaaagccga 900
ctaaaaattt ggcaatttaa agaaagcatc ttttaaagat agaaaaggtt atttcaacaa 960

aaaagtatct tttcttcact tttctttcaa caattcaaag atg gct aga acc ata 1015

Met Ala Arg Thr Ile
1 5

act ttt gat atc cct tcc caa tat aaa ctc gta gat tta ata ggt gag 1063
Thr Phe Asp Ile Pro Ser Gln Tyr Lys Leu Val Asp Leu Ile Gly Glu
10 15 20

gga gcg tac gga aca gta tgt tca gca att cat aag cct tcc ggc ata 1111
Gly Ala Tyr Gly Thr Val Cys Ser Ala Ile His Lys Pro Ser Gly Ile
25 30 35

aag gta gct atc aag aaa ata caa ccg ttt agc aaa aaa ttg ttt gtt 1159
Lys Val Ala Ile Lys Lys Ile Gln Pro Phe Ser Lys Lys Leu Phe Val
40 45 50

aca aga act ata cgt gag atc aag ctt tta cgg tat ttc cat gaa cac 1207
Thr Arg Thr Ile Arg Glu Ile Lys Leu Leu Arg Tyr Phe His Glu His
55 60 65

gaa aac ata ata agt ata ttg gat aaa gta agg cca gta tcc ata gac 1255
Glu Asn Ile Ile Ser Ile Leu Asp Lys Val Arg Pro Val Ser Ile Asp
70 75 80 85



aaa cta aac gct gtt tat tta gtc gaa gag ttg atg gaa acc gat tta 1303
Lys Leu Asn Ala Val Tyr Leu Val Glu Glu Leu Met Glu Thr Asp Leu
90 95 100

caa aaa gta att aat aat cag aat agc ggg ttt tcc act tta agt gat 1351
Gln Lys Val Ile Asn Asn Gln Asn Ser Gly Phe Ser Thr Leu Ser Asp
105 110 115

gac cat gtt caa tac ttt aca tac caa atc ctc aga gcc tta aag tct 1399
Asp His Val Gln Tyr Phe Thr Tyr Gln Ile Leu Arg Ala Leu Lys Ser
120 125 130

att cac agt gca caa gtt atc cat aga gac ata aag cca tca aac ctg 1447
Ile His Ser Ala Gln Val Ile His Arg Asp Ile Lys Pro Ser Asn Leu
135 140 145

tta cta aat tcc aat tgt gat ctc aaa gtc tgc gat ttt gga cta gcg 1495
Leu Leu Asn Ser Asn Cys Asp Leu Lys Val Cys Asp Phe Gly Leu Ala
150 155 160 165

agg tgt tta gct agc agt agc gat tca aga gaa aca ttg gta gga ttc 1543
Arg Cys Leu Ala Ser Ser Ser Asp Ser Arg Glu Thr Leu Val Gly Phe
170 175 180

atg acg gag tac gtc gca acg cga tgg tac agg gca ccc gag ata atg 1591
Met Thr Glu Tyr Val Ala Thr Arg Trp Tyr Arg Ala Pro Glu Ile Met
185 190 195

cta act ttt caa gag tac aca act gcg atg gat ata tgg tca tgc gga 1639
Leu Thr Phe Gln Glu Tyr Thr Thr Ala Met Asp Ile Trp Ser Cys Gly
200 205 210

tgc att ttg gct gaa atg gtc tcc ggg aag cct ttg ttc cca ggc aga 1687
Cys Ile Leu Ala Glu Met Val Ser Gly Lys Pro Leu Phe Pro Gly Arg
215 220 225

gac tat cat cat caa tta tgg cta att cta gaa gtc ttg gga act cca 1735
Asp Tyr His His Gln Leu Trp Leu Ile Leu Glu Val Leu Gly Thr Pro
230 235 240 245

tct ttc gaa gac ttt aat cag atc aaa tcc aag agg gct aaa gag tat 1783
Ser Phe Glu Asp Phe Asn Gln Ile Lys Ser Lys Arg Ala Lys Glu Tyr
250 255 260

ata gca aac tta cct atg agg cca ccc ttg cca tgg gag acc gtc tgg 1831
Ile Ala Asn Leu Pro Met Arg Pro Pro Leu Pro Trp Glu Thr Val Trp
265 270 275

tca aag acc gat ctg aat cca gat atg ata gat tta cta gac aaa atg 1879
Ser Lys Thr Asp Leu Asn Pro Asp Met Ile Asp Leu Leu Asp Lys Met
280 285 290

ctt caa ttc aat cct gac aaa aga ata agc gca gca gaa gct tta aga 1927
Leu Gln Phe Asn Pro Asp Lys Arg Ile Ser Ala Ala Glu Ala Leu Arg
295 300 305

cac cct tac ctg gca atg tac cat gac cca agt gat gag ccg gaa tat 1975
His Pro Tyr Leu Ala Met Tyr His Asp Pro Ser Asp Glu Pro Glu Tyr
310 315 320 325

cct cca ctt aat ttg gat gat gaa ttt tgg aaa ctg gat aac aag ata 2023
Pro Pro Leu Asn Leu Asp Asp Glu Phe Trp Lys Leu Asp Asn Lys Ile
330 335 340

atg cgt ccg gaa gag gag gaa gaa gtg ccc ata gaa atg ctc aaa gac 2071
Met Arg Pro Glu Glu Glu Glu Glu Val Pro Ile Glu Met Leu Lys Asp
345 350 355

atg ctt tac gat gaa cta atg aag acc atg gaa tagtattcac aagaacattt 2124
Met Leu Tyr Asp Glu Leu Met Lys Thr Met Glu
360 365

ctgccatact tctaaaattt ccctatattc agcttagcag tgacacgttg tggtctgtag 2184
gtcaatatgt aagtaagaaa cttcaactca catatgcacg atgcatgcca atggaaaaat 2244
gcaaggaacg aaatggcgcc acggcaacaa gttttttttt ttcgccagca gaagtacacg 2304
aaatgcggct tcatgagcct cttcactgct ttgcctaaac gggaaatgca gagaaaaacc 2364
agccatcgcg tgtgcttgga gagctgacgc gactgtaatc aaagaggcga tatcaacacc 2424
ttttatccag cactattcaa cagtgaatgg gctcccaagt aagtcttggc attgtgcttt 2484
ctattcttaa gtattaagta gaagttttgt ttactgggtt tgtttattcc tggctagatg 2544
ttcgcattcg ttttctagtt gaccatattt accaaatatt cacaactaat acccagccaa 2604
ggtagtctaa aagctaattt ctctaaaagg gagaaagttg gtgatttttt atctcgcatt 2664
attatatatg caagaatagt taaggtatag ttataaagtt ttatcttaat tgccacatac 2724
gtacattgac acgtagaagg actccattat ttttttcatt ctagcatact attattcctt 2784
gtaacgtccc agagtattcc atttaattgt cctccatttc ttaacggtga cgaaggatca 2844
ccatacaaca actactaaag attatagtac actctcacct tgcaactatt tatctgacat 2904
ttgccttact tttatctcca gcttcccctc gattttattt ttcaatttga tttctaaagc 2964
tttttgctta ggcataccaa accatccact catttaacac cttatttttt ttttcgaaga 3024
cagcatccaa ctttatacgt tcactacctt tttttttaca acaatttcat tcttcatcct 3084
atgaaatgac gaaaataacc aga 3107

<210> 24
<211> 368
<212> PRT

<213> Saccharomyces cerevisiae
<400> 24

Met Ala Arg Thr Ile Thr Phe Asp Ile Pro Ser Gln Tyr Lys Leu Val
1 5 10 15

Asp Leu Ile Gly Glu Gly Ala Tyr Gly Thr Val Cys Ser Ala Ile His
20 25 30

Lys Pro Ser Gly Ile Lys Val Ala Ile Lys Lys Ile Gln Pro Phe Ser
35 40 45

Lys Lys Leu Phe Val Thr Arg Thr Ile Arg Glu Ile Lys Leu Leu Arg
50 55 60

Tyr Phe His Glu His Glu Asn Ile Ile Ser Ile Leu Asp Lys Val Arg
65 70 75 80

Pro Val Ser Ile Asp Lys Leu Asn Ala Val Tyr Leu Val Glu Glu Leu
85 90 95

Met Glu Thr Asp Leu Gln Lys Val Ile Asn Asn Gln Asn Ser Gly Phe
100 105 110

Ser Thr Leu Ser Asp Asp His Val Gln Tyr Phe Thr Tyr Gln Ile Leu
115 120 125

Arg Ala Leu Lys Ser Ile His Ser Ala Gln Val Ile His Arg Asp Ile
130 135 140

Lys Pro Ser Asn Leu Leu Leu Asn Ser Asn Cys Asp Leu Lys Val Cys 
145 150 155 160 

Asp Phe Gly Leu Ala Arg Cys Leu Ala Ser Ser Ser Asp Ser Arg Glu 
165 170 175 

Thr Leu Val Gly Phe Met Thr Glu Tyr Val Ala Thr Arg Trp Tyr Arg 
180 185 190 

Ala Pro Glu Ile Met Leu Thr Phe Gln Glu Tyr Thr Thr Ala Met Asp
195 200 205

Ile Trp Ser Cys Gly Cys Ile Leu Ala Glu Met Val Ser Gly Lys Pro
210 215 220

Leu Phe Pro Gly Arg Asp Tyr His His Gln Leu Trp Leu Ile Leu Glu
225 230 235 240

Val Leu Gly Thr Pro Ser Phe Glu Asp Phe Asn Gln Ile Lys Ser Lys
245 250 255

Arg Ala Lys Glu Tyr Ile Ala Asn Leu Pro Met Arg Pro Pro Leu Pro
260 265 270

Trp Glu Thr Val Trp Ser Lys Thr Asp Leu Asn Pro Asp Met Ile Asp
275 280 285

Leu Leu Asp Lys Met Leu Gln Phe Asn Pro Asp Lys Arg Ile Ser Ala
290 295 300

Ala Glu Ala Leu Arg His Pro Tyr Leu Ala Met Tyr His Asp Pro Ser
305 310 315 320

Asp Glu Pro Glu Tyr Pro Pro Leu Asn Leu Asp Asp Glu Phe Trp Lys
325 330 335

Leu Asp Asn Lys Ile Met Arg Pro Glu Glu Glu Glu Glu Val Pro Ile
340 345 350

Glu Met Leu Lys Asp Met Leu Tyr Asp Glu Leu Met Lys Thr Met Glu 
355 360 365 



<210> 25 
<211> 3086 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(2083)
<400> 25 

aatactgaat agaatcacgc tactacgaca agactcggtt actgtgccta aaataatcct 60 
gtgataaacg agttatgtta aacgcagtac aggggttaaa gggcattgag tttttgtgag 120 
tggaaatgcc cccgttatag cttccagttt aattacaaat tatcaattta agcaaatata 180 
actggaggat tggggaggcg actaaaaatg gctaccacgc tattagacat acaacattga 240
gtattttatg taattttgtt actgctagca cggccatgca attggcaact gaaagctatc 300 
tgacaactta aatgattctt aaaacaatga cgactataat cttctctaag aagtttcata 360 
tccatcttcc tcattattca gtttcttttt cctcttgaaa gtatcgtaaa gaacaacgtc 420 
ttcacattag ctattagaag accattgaac taccggatat gagtaagagt gatcttgccg 480 
gagagataat agctgcacaa aggccaagga ttagattaat gggtgcattg tacgaaaaaa 540 
aatagtttac agtcatttat tcgcaataaa tcaatttttt tttcaaaaaa tatgtaagtc 600
tgataaaaaa ttcttcactg aagagagatg cttacattct aattcttgaa taaaagactc 660
tctaacgctg tgaattctct ttagctgtaa cggaaacaga gagttattcc gtagtcactg 720
aatttttttt ttttgacgct attatttaaa acctaggata tccgtcccat acaaaacggc 780
cacgagtttc aatcccagaa tgtacgagtt ataattctcc tagatgcatg atactcgtgc 840
attcgtttaa caatcatacc aatttcccat tttcgggata ttaaacatga acatactttt 900
ttactgtgag aatgtggttt cacaattatt ccatacaggt ataaaaacgc acagaacttc 960

aaacgggaag actatctacc cacattgatg gacaaacgca atg att tct gct aat 1015

Met Ile Ser Ala Asn
1 5

tca tta ctt att tcc act ttg tgc gct ttt gcg atc gca aca cct ttg 1063
Ser Leu Leu Ile Ser Thr Leu Cys Ala Phe Ala Ile Ala Thr Pro Leu
10 15 20

tca aaa aga gat tcc tgc acc cta aca gga tct tct ttg tct tca ctc 1111
Ser Lys Arg Asp Ser Cys Thr Leu Thr Gly Ser Ser Leu Ser Ser Leu
25 30 35

tca acc gtg aaa aaa tgt agc agc atc gtt att aaa gac tta act gtc 1159
Ser Thr Val Lys Lys Cys Ser Ser Ile Val Ile Lys Asp Leu Thr Val
40 45 50

cca gct gga cag act tta gat tta act ggg tta agc agt ggt act act 1207
Pro Ala Gly Gln Thr Leu Asp Leu Thr Gly Leu Ser Ser Gly Thr Thr
55 60 65

gtt acg ttt gaa ggc aca acc aca ttt cag tac aag gaa tgg agc ggc 1255
Val Thr Phe Glu Gly Thr Thr Thr Phe Gln Tyr Lys Glu Trp Ser Gly
70 75 80 85

cct tta att tca atc tca ggg tct aaa atc agc gtt gtt ggt gct tcg 1303
Pro Leu Ile Ser Ile Ser Gly Ser Lys Ile Ser Val Val Gly Ala Ser
90 95 100

gga cat acc att gat ggt caa gga gca aaa tgg tgg gat ggc tta ggt 1351
Gly His Thr Ile Asp Gly Gln Gly Ala Lys Trp Trp Asp Gly Leu Gly
105 110 115

gat agc ggt aaa gtc aaa ccg aag ttt gta aag ttg gcg ttg acg gga 1399
Asp Ser Gly Lys Val Lys Pro Lys Phe Val Lys Leu Ala Leu Thr Gly
120 125 130

aca tct aag gtc acc gga ttg aat att aaa aat gct cca cac caa gtc 1447
Thr Ser Lys Val Thr Gly Leu Asn Ile Lys Asn Ala Pro His Gln Val
135 140 145

ttc agc atc aat aaa tgt tca gat tta acc atc agc gac ata aca att 1495
Phe Ser Ile Asn Lys Cys Ser Asp Leu Thr Ile Ser Asp Ile Thr Ile
150 155 160 165

gat atc aga gac ggt gat tcg gct ggt ggt cat aat acg gat ggg ttt 1543
Asp Ile Arg Asp Gly Asp Ser Ala Gly Gly His Asn Thr Asp Gly Phe
170 175 180

gat gtt ggt agt tct agt aac gtc tta att caa gga tgt act gtt tat 1591
Asp Val Gly Ser Ser Ser Asn Val Leu Ile Gln Gly Cys Thr Val Tyr
185 190 195

aat cag gat gac tgt att gct gtg aat tcc ggt tca act att aaa ttt 1639
Asn Gln Asp Asp Cys Ile Ala Val Asn Ser Gly Ser Thr Ile Lys Phe
200 205 210

atg aac aac tac tgc tac aat ggc cat ggt att tct gta ggt tct gtt 1687
Met Asn Asn Tyr Cys Tyr Asn Gly His Gly Ile Ser Val Gly Ser Val
215 220 225

gg^ ggc cgt tct gat aat aca gtc aat ggt ttc tgg gct gaa aat aac 1735
Gly Gly Arg Ser Asp Asn Thr Val Asn Gly Phe Trp Ala Glu Asn Asn 
230 235 240 245 

cat gtt atc aac tct gac aac ggg ttg aga ata aaa acc gta gaa ggt 1783
His Val Ile Asn Ser Asp Asn Gly Leu Arg Ile Lys Thr Val Glu Gly
250 255 260

gcg aca ggc aca gtc act aat gtc aac ttt atc agt aat aaa att agc 1831
Ala Thr Gly Thr Val Thr Asn Val Asn Phe Ile Ser Asn Lys Ile Ser
265 270 275

ggc ata aaa agt tat ggt att gtt atc gaa ggc gat tat ttg aat agt 1879
Gly Ile Lys Ser Tyr Gly Ile Val Ile Glu Gly Asp Tyr Leu Asn Ser
280 285 290

aag act act gga act gct aca ggt ggc gtt ccc att tcg aat tta gta 1927
Lys Thr Thr Gly Thr Ala Thr Gly Gly Val Pro Ile Ser Asn Leu Val
295 300 305

atg aag gat atc acc ggg agc gtg aac tcc aca gcg aag agg gtt aaa 1975
Met Lys Asp Ile Thr Gly Ser Val Asn Ser Thr Ala Lys Arg Val Lys
310 315 320 325

att ttg gtg aaa aac gct act aac tgg caa tgg tct ggg gtg tca att 2023
Ile Leu Val Lys Asn Ala Thr Asn Trp Gln Trp Ser Gly Val Ser Ile
330 335 340

acc ggt ggt tct tcc tat tct gga tgt tct gga atc cca tct gga tct 2071
Thr Gly Gly Ser Ser Tyr Ser Gly Cys Ser Gly Ile Pro Ser Gly Ser
345 350 355

ggt gca agc tgt taatcctctt ttaaagtact catatgacta tacatacctt 2123
Gly Ala Ser Cys
360

cttttctttt ctttactatt caatacataa cagaacaaag atgcaggaaa atattggtat 2183 
ttgttcggca atttatgctg ggtttttttg taaattcagg tctaattatt actgttgatt 2243
tgtatcaagt tggtatcttt tttgccattt aataatagag atacgctatg ctcatccgga 2303
tagcaacaat gagagcctaa aagtcctaat tgagaagaaa atctctgttc aagactatag 2363
tttatgtttc attctggacc cttgggatcg tctgaaacag gaaggtcaat aattggtaaa 2423
aaaaatggta aatgcgacta agtactacaa ttgaaacgaa tgagcgcact tcatcttcct 2483
acaaaacgct gcggctgaaa aagttacata aaaaaccgtc ctcaatagcg ttaatccagc 2543
gtacatgaga aagtaatgac aaagtcttcg gtaatatcag tgcatctacc aatatgacac 2603
aattgtgaaa cttcgctgac tcaaataata gccctgtttt tttgaccatt gttacccatc 2663
gagccagtga gaaaaaagcc aaaatatctt taaggccttc tccattttat gtttatcgat 2723
attgtgttgt ctgcaatatt gaaattttaa aggctattta ctttgcctct tgttataaac 2783
taagtctgcc gaattatgca atatatagca aaagctgaaa atagatgtaa ttacataatt 2843
cgcagttgta tatgagtatc cttaactcgt acattccagt tcatctgtga caaggcactg 2903
ttttccctaa taattattag ggaaacgtcc ttcaaaaatc aaaataattt tagagagtct 2963
catcaacctt cgccatagtt cgtgatgaaa actttacggt acgtcagact ttagatattg 3023
atttttttat tatttctccc atcgtgagta caattaccct agttcgaact atatctttca 3083
tta 3086

<210> 26
<211> 361 
<212> PRT 

<213> Saccharomyces cerevisiae
<400> 26 

Met Ile Ser Ala Asn Ser Leu Leu Ile Ser Thr Leu Cys Ala Phe Ala
1 5 10 15

Ile Ala Thr Pro Leu Ser Lys Arg Asp Ser Cys Thr Leu Thr Gly Ser
20 25 30

Ser Leu Ser Ser Leu Ser Thr Val Lys Lys Cys Ser Ser Ile Val Ile
35 40 45

Lys Asp Leu Thr Val Pro Ala Gly Gln Thr Leu Asp Leu Thr Gly Leu
50 55 60

Ser Ser Gly Thr Thr Val Thr Phe Glu Gly Thr Thr Thr Phe Gln Tyr
65 70 75 80

Lys Glu Trp Ser Gly Pro Leu Ile Ser Ile Ser Gly Ser Lys Ile Ser
85 90 95

Val Val Gly Ala Ser Gly His Thr Ile Asp Gly Gln Gly Ala Lys Trp
100 105 110

Trp Asp Gly Leu Gly Asp Ser Gly Lys Val Lys Pro Lys Phe Val Lys 
115 120 125 

Leu Ala Leu Thr Gly Thr Ser Lys Val Thr Gly Leu Asn Ile Lys Asn
130 135 140

Ala Pro His Gln Val Phe Ser Ile Asn Lys Cys Ser Asp Leu Thr Ile
145 150 155 160

Ser Asp Ile Thr Ile Asp Ile Arg Asp Gly Asp Ser Ala Gly Gly His
165 170 175

Asn Thr Asp Gly Phe Asp Val Gly Ser Ser Ser Asn Val Leu Ile Gln
180 185 190

Gly Cys Thr Val Tyr Asn Gln Asp Asp Cys Ile Ala Val Asn Ser Gly
195 200 205

Ser Thr Ile Lys Phe Met Asn Asn Tyr Cys Tyr Asn Gly His Gly Ile
210 215 220

Ser Val Gly Ser Val Gly Gly Arg Ser Asp Asn Thr Val Asn Gly Phe 
225 230 235 240 

Trp Ala Glu Asn Asn His Val He Asn Ser Asp Asn Gly Leu Arg He 
245 250 255 

Lys Thr Val Glu Gly Ala Thr Gly Thr Val Thr Asn Val Asn Phe He 
260 265 270 

Ser Asn Lys He Ser Gly He Lys Ser Tyr Gly He Val He Glu Gly 
275 280 285 

Asp Tyr Leu Asn Ser Lys Thr Thr Gly Thr Ala Thr Gly Gly Val Pro 
290 295 300 

He Ser Asn Leu Val Met Lys Asp He Thr Gly Ser Val Asn Ser Thr 
305 310 315 320 

Ala Lys Arg Val Lys He Leu Val Lys Asn Ala Thr Asn Trp Gin Trp 
325 330 335 

Ser Gly Val Ser He Thr Gly Gly Ser Ser Tyr Ser Gly Cys Ser Gly 
340 345 350 

He Pro Ser Gly Ser Gly Ala Ser Cys 
355 360 



<210> 27 
<211> 2486 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(1483) 
<400> 27 

ttctcgagca ttagatgatt aaatcaaaat gacatagtat ttcgcaacct ttcagttggg 60 
ctttgtttaa gaagtggaat acttttgctt gagttgttta gttttatttt atccactgtt 120 
gtcttaacaa atattttcaa gaccggtaag ccgaagatga aaaatcatta ttaactcatt 180 
ttttgaacaa aaatataaac aaaagaaagg caacgcacaa ttttagagat acataaaacg 240 
cagtggatgt taaaaataac agcggtacag aaacgcctgt ctcgctccaa taataattat 300 



acaaatttga aaccgaacgc aatgtgccaa gaaatgtaaa cacactatag aaaaaaatag 360 

aacggtgcac attgtgctag catatctgct tggttctgaa caagaagcac ctggccactt 420 

tctcctagcc caattcttgc caagttttca acctcaatct tttgtgtttg aacaagcatg 480 

tatgaggggt caaaatttag tggaggccgc ttacaatcct tctatttcct ctggacctca 540 

ttagccgtct ggccagacct aagcgtcata atctggagaa tttcattgca tgcgagaata 600 

tgataagtaa gaacttgttt atttatacaa gttccaccca ctcatacacg gctacaatta 660 

tgacgtataa taacgtttcg tctagcccac cttttttact tttgacgttt tatttctttc 720 

gaggatttgg ccaagaatgc cccgaacagc ggaaaaaatg gcgtcgcagt ttcagatgta 780 

tagactcatc ttgtagaaaa aagaatgcaa gaatgaagtc ttttcgtggt gttttgaaaa 840 

cactataaac aaaccgtcaa caaacatttt gtataaatat ttagctatat attgaatatc 900 

ttgaccagta aagcaccttg agaaattgta agcttgaaga acgtactttg atatccctcc 960 

gtttcatcat cctatagctc gtcaacaaat caaaaaaaat atg aag atc agt caa 1015 

Met Lys Ile Ser Gln 
1 5 

ttt ggc tct tta gct ttc gcc cca att gtg cta cta caa ctg ttc att 1063 
Phe Gly Ser Leu Ala Phe Ala Pro Ile Val Leu Leu Gln Leu Phe Ile 
10 15 20 

gtt caa gcg caa ctt ctc aca gat tca aat gct cag gat ttg aat act 1111 
Val Gln Ala Gln Leu Leu Thr Asp Ser Asn Ala Gln Asp Leu Asn Thr 
25 30 35 

gcc ctt gga cag aaa gtg caa tac acc ttt ctt gac act gga aat tct 1159 
Ala Leu Gly Gln Lys Val Gln Tyr Thr Phe Leu Asp Thr Gly Asn Ser 
40 45 50 

aac gat caa cta ctt cat ctt cca agc acc acc tct tcc agc att att 1207 
Asn Asp Gln Leu Leu His Leu Pro Ser Thr Thr Ser Ser Ser Ile Ile 
55 60 65 

act ggt tca tta gct gct gct aat ttc acc ggt tct tca tca tcg tcg 1255 
Thr Gly Ser Leu Ala Ala Ala Asn Phe Thr Gly Ser Ser Ser Ser Ser 
70 75 80 85 

tct ata cca aaa gtc act tcc agc gtc ata aca tct ata aat tac caa 1303 
Ser Ile Pro Lys Val Thr Ser Ser Val Ile Thr Ser Ile Asn Tyr Gln 
90 95 100 

tcc tca aat tct acg gta gtc acc cag ttc acg cca ttg cct tct tcg 1351 
Ser Ser Asn Ser Thr Val Val Thr Gln Phe Thr Pro Leu Pro Ser Ser 
105 110 115 

tcg aga aat gaa aca aaa agc tct caa aca act aat act ata agt tca 1399 
Ser Arg Asn Glu Thr Lys Ser Ser Gln Thr Thr Asn Thr Ile Ser Ser 
120 125 130 

agt aca agc aca gga ggt gta ggt tca gtc aag cca tgt ctt tac ttc 1447 
Ser Thr Ser Thr Gly Gly Val Gly Ser Val Lys Pro Cys Leu Tyr Phe 
135 140 145 

gtt tta atg tta gaa aca atc gct tat ttg ttt tct taaacaaata 1493 
Val Leu Met Leu Glu Thr Ile Ala Tyr Leu Phe Ser 
150 155 160 

tattaggttc aaggtcttcg caggtgtaag aaaacccgtg gtctccatat tcttaagtat 1553 
gataaataaa aaaaaactta ataaattatt aattgcttca aacctttttc tttttttagt 1613 
ttttaatatt tcaaacgtta tcttcattga acgcccaaat agggaaaaat cctggcaaat 1673 
tttttattgc tgtcatccaa ggctatgcta gaaaattcaa gagcttggat gatttaaaaa 1733 
gacactctca atcgagaaag tttattcttt gttattctgc tttacctgat catattccgg 1793 
cgtattgttt ctaatcaagt gatttcgata tccagttacg aaccatttac aacattcctg 1853 
aaaatattgc gtatcaatga tatttgctcc ttctttctcc ctcattaaaa atattctcct 1913 
ggtaagcttt ctaatcagcc acagttttgc tgccaaaact ttaacgtcta gttccaatga 1973 
cgatacactt gccaggtccg cagctgcaga tgcagacatg gcattcttca tggagttttt 2033 
aaacgatttc gacaccgctt ttccacagta tacctcatac atgatgcaaa accatttaac 2093 
cctacctcaa cctgttgctg actactacta tcacatggtt gatttggcct caacagcaga 2153 
tttacaatct gatattgctc agagttttcc gttcactcaa ttccaaacat tcattacggc 2213 
ctttccatgg tatacctctt tgctaaacaa agcctccgcc accaccatat accttcccca 2273 
acacttcata acaggtgaga cagaagctac catgactaac tcatcttatg ccagccaaaa 2333 
aaactccgtt tccaattctg ttcctttctc gacagcgaac gcaggccagt ccatgatttc 2393 
catggctaat gaagaaaaca gtacaacagc acttatatcc gcatcaaact cttcttcaac 2453 
atccagaact agtcaatcac agaatggtgc cca 2486 



<210> 28 
<211> 161 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 28 

Met Lys Ile Ser Gln Phe Gly Ser Leu Ala Phe Ala Pro Ile Val Leu 
1 5 10 15 

Leu Gln Leu Phe Ile Val Gln Ala Gln Leu Leu Thr Asp Ser Asn Ala 
20 25 30 

Gln Asp Leu Asn Thr Ala Leu Gly Gln Lys Val Gln Tyr Thr Phe Leu 
35 40 45 

Asp Thr Gly Asn Ser Asn Asp Gln Leu Leu His Leu Pro Ser Thr Thr 
50 55 60 

Ser Ser Ser Ile Ile Thr Gly Ser Leu Ala Ala Ala Asn Phe Thr Gly 
65 70 75 80 

Ser Ser Ser Ser Ser Ser Ile Pro Lys Val Thr Ser Ser Val Ile Thr 
85 90 95 

Ser Ile Asn Tyr Gln Ser Ser Asn Ser Thr Val Val Thr Gln Phe Thr 
100 105 110 

Pro Leu Pro Ser Ser Ser Arg Asn Glu Thr Lys Ser Ser Gln Thr Thr 
115 120 125 

Asn Thr Ile Ser Ser Ser Thr Ser Thr Gly Gly Val Gly Ser Val Lys 
130 135 140 

Pro Cys Leu Tyr Phe Val Leu Met Leu Glu Thr Ile Ala Tyr Leu Phe 
145 150 155 160 

Ser 



<210> 29 
<211> 2783 
<212> DNA 

<213> Saccharomyces cerevisiae 

<220> 
<221> CDS 

<222> (1001)..(1780) 
<400> 29 

attctttggt tgtgcttata cataattgaa aaagtgctaa aatccttaca cttccaaaac 60 

attgaaagtg gtaattattt tccatctaaa accgttggga gccaccccag aaaaccctta 120 

ttctctgcct tcgtgaaaca gctgcttata ttcattgttg ggctgggcgt gatgaagttc 180 

tgcgtgtttc taatactaaa ctacttagaa gacttggcat actggttcgc cgatcttatc 240 

cttggctggt cagattcatg gccaaacttt caagtttttc tggtcatgtt tgtctttcct 300 

atcttactga attgcttcca gtacttttgt gtcgacaatg tcatcaggtt acattctgag 360 

agcctaacga taaccaatgc agagaatttt gaaacgaaca cattcctaaa tgacgaaatt 420 

cctgatttat cggaagtctc aaatgaagtg cctaacaagg ataacaacat ttccagctat 480 

ggtagcataa tatagtattc caaggataag gaaagcatgc actgtttatt tcctttcctt 540 

gcttaattga ttttttttaa agggaacaaa cattttgatt tcaatttcca caagcctaga 600 

ctcttcaaca cataatctgt gggttattgt ttgggaaagc attctccgct agaagaatga 660 

aactggcgct caggtttgat tctataacta cggcagtttt tcctattcta ttttcgtttt 720 

ttgattttcc cgccgcattg gatattcaat tcgcgacgct aataattggc atttcgtgtt 780 

cttaagtaat ttcgtgtttc aaataaccgt aaacagagaa agacccaaga atttcagatg 840 

gcttagaaga ggtagacatt aaatcaatct gtatgtgatg gagagggagt gtatttaaaa 900 

gacgtaagaa aatgaattat caagatccgt tatggccatc tagtctcttt cttgtacact 960 

agttgtctaa cacaaccaac aaattagaat atatatcgca atg att ttc aaa ata 1015 

Met Ile Phe Lys Ile 
1 5 

ttg tgt agt tta cta ctg gta acc tcc aac ttc gct tct gcc tta tat 1063 
Leu Cys Ser Leu Leu Leu Val Thr Ser Asn Phe Ala Ser Ala Leu Tyr 
10 15 20 
gtc aat gaa act acg agc tat aca cca tac acg aag aca tta act cca 1111 
Val Asn Glu Thr Thr Ser Tyr Thr Pro Tyr Thr Lys Thr Leu Thr Pro 
25 30 35 

aca tac tct gtt tca cct caa gag aca aca tta acg tac agc gat gaa 1159 
Thr Tyr Ser Val Ser Pro Gln Glu Thr Thr Leu Thr Tyr Ser Asp Glu 
40 45 50 

aca acc acc ttc tac ata aca tct act ttt tac tct acc tac tgg ttc 1207 
Thr Thr Thr Phe Tyr Ile Thr Ser Thr Phe Tyr Ser Thr Tyr Trp Phe 
55 60 65 

act acc tcc caa tca gct gct att att agt aca cct act gca agt aca 1255 
Thr Thr Ser Gln Ser Ala Ala Ile Ile Ser Thr Pro Thr Ala Ser Thr 
70 75 80 85 

cct act gca agc acg cct agc cta act acg tcc aca aat gaa tac acc 1303 
Pro Thr Ala Ser Thr Pro Ser Leu Thr Thr Ser Thr Asn Glu Tyr Thr 
90 95 100 

acc acc tat tct gac aca gac acc acc tac acg tct act ctg acc tct 1351 
Thr Thr Tyr Ser Asp Thr Asp Thr Thr Tyr Thr Ser Thr Leu Thr Ser 
105 110 115 

act tac ata ata act cta tct acg gaa tcc gcc aac gag aag gct gaa 1399 
Thr Tyr Ile Ile Thr Leu Ser Thr Glu Ser Ala Asn Glu Lys Ala Glu 
120 125 130 

cag att tcc acg agc gtc aca gaa att gct tct aca gta acc gaa tcg 1447 
Gln Ile Ser Thr Ser Val Thr Glu Ile Ala Ser Thr Val Thr Glu Ser 
135 140 145 

ggc agt aca tac acc tct act ttg acc tca acc tta ttg gtt act gta 1495 
Gly Ser Thr Tyr Thr Ser Thr Leu Thr Ser Thr Leu Leu Val Thr Val 
150 155 160 165 

tat aat tcc caa gct agt aat aca ata gcg aca tcc aca gct ggg gac 1543 
Tyr Asn Ser Gln Ala Ser Asn Thr Ile Ala Thr Ser Thr Ala Gly Asp 
170 175 180 

gcc gcc tcc aat gtt gat gcc tta gaa aag tta gtc tct gct gaa cat 1591 
Ala Ala Ser Asn Val Asp Ala Leu Glu Lys Leu Val Ser Ala Glu His 
185 190 195 

caa tct cag atg att caa acc aca tcc gcc gat gaa cag tac tgt agt 1639 
Gln Ser Gln Met Ile Gln Thr Thr Ser Ala Asp Glu Gln Tyr Cys Ser 
200 205 210 

gcg tct acc aag tat gtt aca gtt aca gct gct gca gtt acc gaa gtg 1687 
Ala Ser Thr Lys Tyr Val Thr Val Thr Ala Ala Ala Val Thr Glu Val 
215 220 225 

gtt act act acg gcg gag cct gtt gtt aaa tac gtt act ata act gcc 1735 
Val Thr Thr Thr Ala Glu Pro Val Val Lys Tyr Val Thr Ile Thr Ala 
230 235 240 245 

gat gct agt aat gtt aca ggt tct gct aac aac ggt acc cac att 1780 
Asp Ala Ser Asn Val Thr Gly Ser Ala Asn Asn Gly Thr His Ile 
250 255 260 

taatgcgtga cgttgaatcg agaaaaaaag ctacttttaa cgaaaccttt actagttatc 1840 

ctatatggga tcactagtat tttttgattt acgattcaat aaatagacta gagacaactt 1900 

tcatatcatt ccttaaaaaa tacataaagc gcaaattcaa ccccattgat acatatataa 1960 

gtagttctat tatgactttc aagaacaata gtagcttttc taaataatca ataagtagca 2020 

caaaatctgt ctgtttgtac gcttatattt agtttgcgtt tatttgcgag cgccacgaga 2080 

aggggcagga aaaaaagatc aatagtttgc aataaacatc gaatgatgat ttcaaccacc 2140 

gatacataaa ccagcgaggc tttcaaggaa gaatgaacgt gaactcgtca actcaaaaag 2200 

aaaatgagcc agcatattag gaaattagat tctgatgttt ctgaaagact taaatctcag 2260 

gcatgcacgg tatcgctagc atcagcggtt agagaaatag ttcaaaattc tgtagatgca 2320 

cacgctacca ctatcgacgt catgatcgac ctccctaatt tgagctttgc agtttacgat 2380 

gatggtattg gtttgactcg aagtgaccta aatatattgg ccacacaaaa ttatacttcc 2440 

aaaatacgaa agatgaatga tttagtaacg atgaaaacct acggttacag aggagacgcc 2500 

ctatatagca tttctaatgt ctctaatctg tttgtttgtt ccaagaaaaa ggattacaac 2560 

tctgcatgga tgagaaaatt tccatccaaa agcgtcatgt tgagtgagaa taccatactc 2620 

ccaatagatc ctttttggaa aatttgtcct tggagccgaa caaagtctgg tactgttgtt 2680 

attgttgaag atatgctgta taatttacct gtccggcgca gaatactaaa ggaagaaccc 2740 

cctttcaaga cttttaacac aataaaggca gatatgctac aga 2783 
<210> 30 
<211> 260 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 30 

Met Ile Phe Lys Ile Leu Cys Ser Leu Leu Leu Val Thr Ser Asn Phe 
1 5 10 15 

Ala Ser Ala Leu Tyr Val Asn Glu Thr Thr Ser Tyr Thr Pro Tyr Thr 
20 25 30 

Lys Thr Leu Thr Pro Thr Tyr Ser Val Ser Pro Gln Glu Thr Thr Leu 
35 40 45 

Thr Tyr Ser Asp Glu Thr Thr Thr Phe Tyr Ile Thr Ser Thr Phe Tyr 
50 55 60 

Ser Thr Tyr Trp Phe Thr Thr Ser Gln Ser Ala Ala Ile Ile Ser Thr 
65 70 75 80 

Pro Thr Ala Ser Thr Pro Thr Ala Ser Thr Pro Ser Leu Thr Thr Ser 
85 90 95 

Thr Asn Glu Tyr Thr Thr Thr Tyr Ser Asp Thr Asp Thr Thr Tyr Thr 
100 105 110 

Ser Thr Leu Thr Ser Thr Tyr Ile Ile Thr Leu Ser Thr Glu Ser Ala 
115 120 125 

Asn Glu Lys Ala Glu Gln Ile Ser Thr Ser Val Thr Glu Ile Ala Ser 
130 135 140 

Thr Val Thr Glu Ser Gly Ser Thr Tyr Thr Ser Thr Leu Thr Ser Thr 
145 150 155 160 

Leu Leu Val Thr Val Tyr Asn Ser Gln Ala Ser Asn Thr Ile Ala Thr 
165 170 175 

Ser Thr Ala Gly Asp Ala Ala Ser Asn Val Asp Ala Leu Glu Lys Leu 
180 185 190 

Val Ser Ala Glu His Gln Ser Gln Met Ile Gln Thr Thr Ser Ala Asp 
195 200 205 

Glu Gln Tyr Cys Ser Ala Ser Thr Lys Tyr Val Thr Val Thr Ala Ala 
210 215 220 

Ala Val Thr Glu Val Val Thr Thr Thr Ala Glu Pro Val Val Lys Tyr 
225 230 235 240 

Val Thr Ile Thr Ala Asp Ala Ser Asn Val Thr Gly Ser Ala Asn Asn 
245 250 255 

Gly Thr His Ile 
260 